(57) Abstract: A unique HCV RNA molecule is provided having an enhanced efficiency of
establishing cell culture replication. Novel adaptive mutations have been identified
within the HCV non-structural region that improve the efficiency of establishing
persistently replicating HCV RNA in cell culture. This self-replicating polynucleotide
molecule contains, contrary to all previous reports, a 5'-NTR whose initiating nucleotide
can be an A as an alternative to the G already disclosed, and therefore provides an
alternative to existing systems comprising a self-replicating HCV RNA molecule. The G->A
mutation gives rise to HCV RNA molecules that, in conjunction with mutations in the HCV
non-structural region, such as the G(2042)C/R mutations, possess greater efficiency of
transduction and/or replication. These RNA molecules, when transfected into a cell line,
are useful for evaluating potential inhibitors of HCV replication.

SELF-REPLICATING RNA MOLECULE FROM HEPATITIS C VIRUS 

Field of the invention 

The present invention relates generally to a HCV RNA molecule that self-replicates
in appropriate cell lines, particularly to a self-replicating HCV RNA construct having
an enhanced efficiency of establishing cell culture replication.

Background of the Invention 

Hepatitis C virus (HCV) is the major etiological agent of post-transfusion and
community-acquired non-A non-B hepatitis worldwide. It is estimated that over 200
million people worldwide are infected by the virus. A high percentage of carriers
become chronically infected and many progress to chronic liver disease, so-called
chronic hepatitis C. This group is in turn at high risk for serious liver disease such as
liver cirrhosis, hepatocellular carcinoma and terminal liver disease leading to death.
The mechanism by which HCV establishes viral persistence and causes a high rate
of chronic liver disease has not been thoroughly elucidated. It is not known how
HCV interacts with and evades the host immune system. In addition, the roles of
cellular and humoral immune responses in protection against HCV infection and
disease have yet to be established.

Various clinical studies have been conducted with the goal of identifying
pharmaceutical compounds capable of effectively treating HCV infection in patients
afflicted with chronic hepatitis C. These studies have involved the use of
interferon-alpha, alone and in combination with other antiviral agents such as ribavirin.
Such studies have shown that a substantial number of the participants do not respond to
these therapies, and of those that do respond favorably, a large proportion were
found to relapse after termination of treatment. To date there are no broadly
effective antiviral compounds for treatment of HCV infection.

HCV is an enveloped positive strand RNA virus in the Flaviviridae family. The single
strand HCV RNA genome is of positive polarity and comprises one open reading
frame (ORF) of approximately 9600 nucleotides in length, which encodes a linear
polyprotein of approx. 3010 amino acids. In infected cells, this polyprotein is cleaved
at multiple sites by cellular and viral proteases to produce structural and
non-structural (NS) proteins. The structural proteins (C, E1, E2 and E2-p7) comprise
polypeptides that constitute the virus particle (Hijikata et al., 1991; Grakoui et al.,
1993(a)). The non-structural proteins (NS2, NS3, NS4A, NS4B, NS5A, NS5B)
are enzymes or accessory factors that catalyze and regulate the replication of
the HCV RNA genome. Processing of the structural proteins is catalyzed by host
cell proteases (Hijikata et al., 1991). The generation of the mature non-structural
proteins is catalyzed by two virally encoded proteases. The first is the NS2/3
zinc-dependent metalloprotease, which auto-catalyzes the release of the NS3 protein from
the polyprotein. The released NS3 contains an N-terminal serine protease domain
(Grakoui et al., 1993(b); Hijikata et al., 1993) and catalyzes the remaining cleavages
from the polyprotein. The released NS4A protein has at least two roles: first,
forming a stable complex with NS3 protein and assisting in the membrane
localization of the NS3/NS4A complex (Kim et al., Arch Virol. 1999, 144: 329-343);
and second, acting as a cofactor for NS3 protease activity. This membrane-associated
complex in turn catalyzes the cleavage of the remaining sites on the
polyprotein, thus effecting the release of NS4B, NS5A and NS5B (Bartenschlager et
al., 1993; Grakoui et al., 1993(a); Hijikata et al., 1993; Love et al., 1996; reviewed in
Kwong et al., 1998). The C-terminal segment of the NS3 protein also harbors
nucleoside triphosphatase and RNA helicase activity (Kim et al., 1995). The function
of the protein NS4B is unknown. NS5A, a highly phosphorylated protein, seems to
be responsible for the interferon resistance of various HCV genotypes (Gale Jr. et al.
1997 Virology 230, 217; Reed et al., 1997). NS5B is an RNA-dependent RNA
polymerase (RdRp) that is involved in the replication of HCV.

The open reading frame of the HCV RNA genome is flanked on its 5' end by a
non-translated region (NTR) of approx. 340 nucleotides that functions as the internal
ribosome entry site (IRES), and on its 3' end by a NTR of approximately 230
nucleotides. Both the 5' and 3' NTRs are important for RNA genome replication. The
genomic sequence variance is not evenly distributed over the genome and the
5'NTR and parts of the 3'NTR are the most highly conserved portions. The authentic,
highly conserved 3'NTR is the object of US patent 5,874,565 granted to Rice et al.

The cloned and characterized partial and complete sequences of the HCV genome
have also been analyzed with regard to appropriate targets for a prospective antiviral
therapy. Four viral enzyme activities provide possible targets: (1) the NS2/3
protease; (2) the NS3/4A protease complex; (3) the NS3 helicase; and (4) the NS5B
RNA-dependent RNA polymerase. The NS3/4A protease complex and the NS3
helicase have already been crystallized and their three-dimensional structure
determined (Kim et al., 1996; Yem et al., 1998; Love et al., 1996; Kim et al., 1998;
Yao et al., 1997; Cho et al., 1998). The NS5B RNA-dependent RNA polymerase has
also been crystallized to reveal a structure reminiscent of other nucleic acid
polymerases (Bressanelli et al. 1999, Proc. Natl. Acad. Sci. USA 96: 13034-13039;
Ago et al. 1999, Structure 7: 1417-1426; Lesburg et al. 1999, Nat. Struct. Biol. 6:
937-943).

Even though important targets for the development of a therapy for chronic HCV
infection have been defined with these enzymes and even though a worldwide
intensive search for suitable inhibitors is ongoing with the aid of rational drug design
and HTS, the development of therapy has one major deficiency, namely the lack of
cell culture systems or simple animal models that allow direct and reliable
propagation of HCV viruses. The lack of an efficient cell culture system is still the
main reason to date that an understanding of HCV replication remains elusive.

Although flavi- and pestivirus self-replicating RNAs have been described and used
for the replication in different cell lines with a relatively high yield, similar experiments
with HCV have not been successful to date (Khromykh et al., 1997; Behrens et al.,
1998; Moser et al., 1998). It is known from different publications that cell lines or
primary cell cultures can be infected with high-titer patient serum containing HCV
(Lanford et al. 1994; Shimizu et al. 1993; Mizutani et al. 1996; Ikeda et al. 1998;
Fournier et al. 1998; Ito et al. 1996). However, these virus-infected cell lines or cell
cultures do not allow the direct detection of HCV-RNA or HCV antigens.

It is also known from the publications of Yoo et al. 1995 and of Dash et al. 1997
that hepatoma cell lines can be transfected with synthetic HCV-RNA obtained
through in vitro transcription of the cloned HCV genome. In both publications the
authors started from the basic idea that the viral HCV genome is a plus-strand RNA
functioning directly as mRNA after being transfected into the cell, permitting the
synthesis of viral proteins in the course of the translation process, so that new HCV
particles could form and their RNA could be detected through RT-PCR.
However, the published results of the RT-PCR experiments indicate that the HCV
replication in the described HCV transfected hepatoma cells is not particularly
efficient and not sufficient to measure the quality of replication, let alone measure the
modulations in replication after exposure to potential antiviral drugs. Furthermore, it
is now known that the highly conserved 3' NTR is essential for the virus replication
(Yanagi et al., 1999). This knowledge strictly contradicts the statements of Yoo et al.
(supra) and Dash et al. (supra), who used for their experiments only HCV genomes
with shorter 3' NTRs and not the authentic 3' end of the HCV genome.

In WO 98/39031, Rice et al. disclosed authentic HCV genome RNA sequences, in
particular containing: a) the highly conserved 5'-terminal sequence "GCCAGCC"; b)
the HCV polyprotein coding region; and c) 3'-NTR authentic sequences.

In WO 99/04008, Purcell et al. disclosed an HCV infectious clone that also contained
only the highly conserved 5'-terminal sequence "GCCAGC".

Recently Lohmann et al. 1999 (Science 285: 110-113) and Bartenschlager et al. (in
CA 2,303,526, laid-open on October 3, 2000) disclosed a HCV cell culture system
where the viral RNA (I377/NS2-3') self-replicates in the transfected cells with such
efficiency that the quality of replication can be measured with accuracy and
reproducibility. The Lohmann and Bartenschlager disclosures were the first
demonstration of HCV RNA replication in cell culture that was substantiated through
direct measurement by Northern blots. This replicon system and sequences
disclosed therein highlight once again the conserved 5' sequence "GCCAGC". A
similar observation highlighting the conservation of the 5'NTR was made by Blight et
al. 2000 (Science 290: 1972-1974) and WO 01/89364 published on Nov. 29, 2001.

In addition to the conservation of the 5' and 3' untranslated regions in cell culture
replicating RNAs, three other publications by Lohmann et al. 2001, Krieger et al. 2001
and Guo et al. 2001 have recently disclosed distinct adaptive mutants within the
HCV non-structural protein coding region. Specific nucleotide changes that alter the
amino acids of the HCV non-structural proteins are shown to enhance the efficiency
of establishing stable replicating HCV subgenomic replicons in cultured cells.

Applicant has now found that, contrary to all previous reports, the highly conserved
5'-NTR can be mutated by adaptation to give rise to a HCV RNA sequence that, in
conjunction with mutations in the HCV non-structural region, provides for a greater
efficiency of transduction and/or replication.

Applicant has also identified novel adaptive mutations within the HCV non-structural
region that improve the efficiency of establishing persistently replicating HCV RNA
in cell culture.

One advantage of the present invention is to provide an alternative to these existing
systems comprising a HCV RNA molecule that self-replicates. Moreover, the present
invention demonstrates that the initiating nucleotide of the plus-strand genome can
be an A as an alternative to the G already disclosed.

A further advantage of the present invention is to provide a unique HCV RNA
molecule that transduces and/or replicates with higher efficiency. The Applicant
demonstrates the utility of this specific RNA molecule in a cell line and its use in
evaluating a specific inhibitor of HCV replication.

Summary of the invention 

In a first embodiment, the present invention provides a 5'-non translated region of
the hepatitis C virus wherein its highly conserved guanine at position 1 is substituted
with adenine.

Particularly, the present invention provides a hepatitis C virus polynucleotide
comprising adenine at position 1 as numbered according to the I377/NS2-3'
construct (Lohmann et al. 1999, Accession # AJ242651).

Particularly, the invention provides a HCV self-replicating polynucleotide comprising 
a 5'-terminus consisting of ACCAGC (SEQ ID NO. 8). 

In a second embodiment, the present invention is directed to a HCV self-replicating
polynucleotide encoding a polyprotein comprising one or more amino acid
substitutions selected from the group consisting of: R(1135)K; S(1148)G; S(1560)G;
K(1691)R; L(1701)F; I(1984)V; T(1993)A; G(2042)C; G(2042)R; S(2404)P;
L(2155)P; P(2166)L and M(2992)T.

Particularly, the invention is directed to a HCV self-replicating polynucleotide
encoding a polyprotein comprising any one of the amino acid substitutions as
described above, further comprising the amino acid substitution E(1202)G.
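
The substitution labels used above, such as G(2042)C, read as original residue, position in
the polyprotein, replacement residue. The following is only a minimal sketch of how such a
label could be parsed and applied to a sequence held as a string. The names
`Substitution`, `parse_substitution` and `apply_substitution` and the `offset` parameter are
hypothetical and do not come from this description, and the short sequences in the usage
note are toy examples, not HCV sequence.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the residue being replaced
    position: int     # 1-based position in the polyprotein numbering
    replacement: str  # one-letter code of the new residue


# Accepts both 'G(2042)C' and 'G2042C' style labels.
_LABEL = re.compile(r"^([A-Z])\(?(\d+)\)?([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a substitution label into original residue, position, replacement."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


def apply_substitution(sequence: str, sub: Substitution, offset: int = 1) -> str:
    """Return a copy of `sequence` carrying the substitution.

    `offset` is the polyprotein number of the first residue in `sequence`
    (an assumption of this sketch, useful when only a segment is available).
    """
    index = sub.position - offset
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {sub.position} lies outside the sequence")
    if sequence[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at {sub.position}, found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]
```

For instance, `apply_substitution("MAGT", parse_substitution("G3C"))` changes the third
residue of the toy sequence from G to C. The check on the original residue is a design
choice that guards against applying a label under the wrong numbering.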

More particularly, the invention provides a HCV self-replicating polynucleotide
encoding a polyprotein comprising a G2042C or a G2042R mutation.

Most particularly, the invention provides for a HCV self-replicating polynucleotide
comprising a nucleotide substitution G->A at position 1, wherein said polynucleotide
encodes a polyprotein further comprising a G2042C or a G2042R mutation.

Particularly, the polynucleotide of the present invention can be in the form of RNA or 
DNA that can be transcribed to RNA. 

In a third embodiment, the invention also provides for an expression vector
comprising a DNA form of the above polynucleotide, operably linked with a promoter.

According to a fourth embodiment, there is provided a host cell transfected with the 
self-replicating polynucleotide or the vector as described above. 

In a fifth embodiment, the present invention provides a RNA replication assay 
comprising the steps of: 

- incubating the host cell as described above in the absence or presence of a 
potential hepatitis C virus inhibitor; 

- isolating the total cellular RNA from the cells;

- analyzing the RNA so as to measure the amount of HCV RNA replicated; 

- comparing the levels of HCV RNA in cells in the absence and presence of the 
inhibitor. 
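
The final comparison step of the assay above can be expressed as a percent inhibition. The
sketch below is only an illustration under stated assumptions: the function name
`percent_inhibition` is hypothetical, both inputs are taken to be HCV RNA levels in the same
units, and the formula is the usual ratio to the untreated control rather than a calculation
specified in this description.

```python
def percent_inhibition(rna_without_inhibitor: float, rna_with_inhibitor: float) -> float:
    """Percent reduction of HCV RNA in treated cells relative to untreated cells.

    Both arguments must be in the same units (for example copies per
    microgram of total RNA); this unit choice is an assumption of the sketch.
    """
    if rna_without_inhibitor <= 0:
        raise ValueError("the untreated control level must be positive")
    return 100.0 * (1.0 - rna_with_inhibitor / rna_without_inhibitor)
```

A result near 0 means the compound left the HCV RNA level unchanged, and a result near 100
means the HCV RNA was almost entirely lost in the treated cells.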

In a sixth embodiment, the invention is directed to a method for testing a compound
for inhibiting HCV replication, including the steps of:

a) treating the above described host cell with the compound; 

b) evaluating the treated host cell for reduced replication, wherein reduced 
replication indicates the ability of the compound to inhibit replication. 

DETAILED DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic view of the bi-cistronic replicon RNA. The sequence
deviations between the I377/NS2-3' replicon from Lohmann et al., 1999 and the
APGK12 replicon are indicated below the replicon. In place of a G nucleotide at the
+1 position in the I377/NS2-3' replicon, the APGK12 contains an additional G
resulting in GG at the 5' terminus (the first G being counted as position -1). In the
linker region between the neo gene and the EMCV IRES sequence two areas
deviate from I377/NS2-3': 14 nucleotides (CGCGCCCAGATGTT) which are not
present in I377/NS2-3' are inserted at position 1184 in APGK12; 11 nucleotides
(1231-1241) present in I377/NS2-3' are deleted to generate APGK-12. In the NS5B
coding region, a T at position 8032 was mutated to C to eliminate a NcoI restriction
site.

Figure 2 shows Northern blots of RNA-transfected Huh-7 cell lines. 12 µg of total
cellular RNA or control RNA was separated on 0.5% agarose-formaldehyde gels
and transferred to Hybond N+ paper, fixed and (Figure 2A) radioactively probed with
HCV specific minus-strand RNA that detects the presence of plus-strand replicon
RNA. Lanes 1 and 2: positive controls that contain 10^9 copies of in vitro transcribed
APGK12 RNA. Lane 3: negative control of total cellular RNA from untransfected
Huh-7 cells. Lanes 4 and 5: cellular RNA from B1 and B3 cell lines that have
integrated DNA copies of the neomycin phosphotransferase gene. Lane 6: total
cellular RNA from a Huh-7 cell line, designated S22.3, that harbors high copy
number HCV sub-genomic replicon RNA as highlighted by the arrow. Other cell lines
have no detectable replicon RNA. Figure 2B is identical to Figure 2A with the
exception that the blot was radioactively probed with HCV specific plus-strand RNA
to detect the presence of HCV minus-strand RNA. Lanes 1 and 2 are positive control
lanes that contain 10^9 copies of full length HCV minus-strand RNA. Lane 6, which
contains 12 µg of total cellular RNA from cell line S22.3, harbors detectable
minus-strand replicon RNA at the expected size of 8 - 9 kilobases. M represents the
migration of non-radioactive molecular size markers on the agarose gel. 28S
represents the migration of 28S ribosomal RNA and accounts for the detection of this
species in samples of total cellular RNA.

Figure 3 shows indirect immunofluorescence of a HCV non-structural protein in the
S22.3 cell line. Indirect immunofluorescence was performed on cells that were
cultured and fixed, permeabilized and exposed to a rabbit polyclonal antibody
specific for a segment of the HCV NS4A protein. Secondary goat anti-rabbit antibody
conjugated with red-fluor Alexa 594 (Molecular Probes) was used for detection. The top
panels show the results of immunofluorescence (40X objective) and the specific
staining of the S22.3 cells. The bottom panels represent the identical field of cells
viewed by differential interference contrast (DIC) microscopy. The majority of S22.3
cells (Figure 3A) within the field stain positively for HCV NS4A protein, which localizes
in the cytoplasm, whereas the B1 cells (Figure 3B), which fail to express any HCV
proteins, have only a background level of staining.

Figure 4 shows Western blots following SDS-PAGE separation of total proteins
extracted from three cell lines: (i) naive Huh-7 cell line, (ii) neomycin resistant Huh-7
cell line B1, and (iii) the S22.3 cell line. Panels A, B, and C demonstrate the results
of Western blots probed with rabbit polyclonal antisera specific for neomycin
phosphotransferase (NPT), HCV NS3, and HCV NS5B, respectively. Visualization
was achieved through autoradiographic detection of a chemiluminescent reactive
secondary goat anti-rabbit antibody. Panel A shows that the S22.3 RNA replicon
cell line expresses the NPT protein at levels higher than control B1 cells and that
the naive Huh-7 cell line does not produce the NPT protein. Panels B and C show
that only the S22.3 cell line produces the mature HCV NS3 and NS5B proteins,
respectively. M represents molecular weight (in kilodaltons) of pre-stained
polypeptide markers.

Figures 5A and 5B identify the nucleotide and amino acid sequences, respectively,
that differ from the APGK12 sequence in the different HCV bi-cistronic replicons.
The S22.3 adapted replicon is a first generation replicon selected following the
transfection of RNA transcribed from the APGK12 template. R3, R7, R16 are second
generation replicons that were selected following the transfection of RNA isolated
from the S22.3 first generation replicon cell line. Figure 5A: Nucleotide mutations
that were characterized in each of the adapted replicons are indicated adjacent to
the respective segment of the replicon (IRES, NS3, NS4A, NS5A, and NS5B). Figure
5B: Amino acids are numbered according to the full length HCV polyprotein
with the first amino acid in the second cistron corresponding to amino acid 810 in
NS2 of the I377/NS2-3' construct.

Figure 6 depicts the colony formation efficiency of four in vitro transcribed HCV
sub-genomic bi-cistronic replicon RNAs. The APGK12 serves as the reference
sequence; highlighted are the initiating nucleotides of the HCV IRES in each of the
constructs and the amino acid differences (from the APGK12 reference sequence) in
the HCV non-structural region for the two R3-rep constructs. Note that the in vitro
transcribed APGK-12 RNAs that harbor either a 5'G or 5'A form colonies with the same
efficiency (ca. 80 cfu/µg in panels A and B) following selection with 0.25 mg/ml
G418. RNA isolated from the second generation R3 cell line was reverse transcribed
into DNA and cloned into the pAPGK12 vector backbone to generate the R3-rep,
which was sequenced and found to encode additional changes that included the
L(2155)P substitution in the NS5A segment of the HCV polyprotein (compare R3-rep
sequence with the R3 sequence in tables 2 and 3). Various quantities of in vitro
transcribed R3-rep-5'A RNA were transfected into naive Huh-7 cells to determine a
colony formation efficiency of 1.2 x 10^6 cfu/µg of RNA (panel C). Various quantities
of R3-rep-5'G were also transfected, resulting in a colony formation efficiency of 2 x
10^6 cfu/µg of RNA (panel D).
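
Colony formation efficiency as quoted for Figure 6 is a count of G418-resistant colonies
normalized to the mass of transfected RNA. A minimal sketch of that arithmetic follows,
assuming the conventional definition (colonies divided by micrograms of RNA). The function
name `colony_formation_efficiency` is hypothetical, and the numbers in the usage note are
invented for illustration and are not the experimental data.

```python
def colony_formation_efficiency(colonies: int, rna_micrograms: float) -> float:
    """Colony-forming units per microgram of transfected RNA (cfu/ug)."""
    if colonies < 0:
        raise ValueError("colony count cannot be negative")
    if rna_micrograms <= 0:
        raise ValueError("RNA quantity must be positive")
    return colonies / rna_micrograms
```

As an invented example, 120 colonies obtained from 0.0001 µg of RNA would correspond to
about 1.2 x 10^6 cfu/µg, which illustrates why highly efficient replicons are titrated
over several small RNA quantities.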

Figure 7 displays a typical RT-PCR amplification plot (left panel) and the graphical
representation of Ct values versus known HCV RNA quantity in a standard curve
(right panel). Each of the plotted curves in the left panel graphs the increment of
fluorescence reporter signal (delta-Rn) versus PCR cycle number for a
predetermined quantity of HCV replicon RNA. The Ct value is obtained by
determining the point at which the fluorescence exceeds an arbitrary value
(horizontal line). The right panel demonstrates the linear relationship between
starting RNA copy number of the predetermined standards (large black dots) and the
Ct value. Smaller dots are the Ct values of RNA samples (containing unknown
quantity of HCV replicon RNA) from S22.3 cells treated with various concentrations
of a specific inhibitor of HCV replication.
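
The standard curve described for Figure 7 can be sketched as a straight-line fit of Ct
against the base-10 logarithm of the starting copy number, which is then inverted to
estimate the copy number of an unknown sample. This is a minimal sketch under that
assumption (the text states only that the relationship is linear). The function names
`fit_standard_curve` and `copies_from_ct` are hypothetical, and an ordinary least-squares
fit is used here as one reasonable choice.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(copies: Sequence[float], ct_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    if len(copies) != len(ct_values) or len(copies) < 2:
        raise ValueError("need at least two matched standards")
    xs = [math.log10(c) for c in copies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the fitted line to estimate the starting copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)
```

In use, the standards of known copy number are fitted once per run, and each sample Ct is
then converted with `copies_from_ct`. The slope is negative because samples with more
starting RNA cross the fluorescence threshold at an earlier cycle.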

Figure 8 shows the effect of increasing concentration of inhibitor A on HCV RNA
replicon levels in Huh7 cells. S22.3 cells were grown in the presence of increasing
concentrations of inhibitor A starting at 0.5 nM and ranging to 1024 nM. The inhibitor
dose-response curve is the result of 11 concentrations from serial two-fold dilutions
(1:1). One control well, without any inhibitor, was also included during the course of
the experiment. The cells were incubated for 4 days in a 5% CO2 incubator at 37 °C.
Total cellular RNA was extracted and quantified by optical density. HCV replicon RNA
was evaluated by real time RT-PCR and plotted as genome equivalents/µg total
RNA as a function of inhibitor concentration.

Definitions 

Unless defined otherwise, the scientific and technological terms and nomenclature
used herein have the same meaning as commonly understood by a person of
ordinary skill in the art to which this invention pertains. Generally, the procedures for
cell culture, infection, molecular biology methods and the like are common methods used
in the art. Such standard techniques can be found in reference manuals such as, for
example, Sambrook et al. (1989) and Ausubel et al. (1994).

Nucleotide sequences are presented herein by single strand, in the 5' to 3' direction,
from left to right, using the one letter nucleotide symbols as commonly used in the art
and in accordance with the recommendations of the IUPAC-IUB Biochemical
Nomenclature Commission (1972).

The present description refers to a number of routinely used recombinant DNA 
(rDNA) technology terms. Nevertheless, definitions of selected examples of such 
rDNA terms are provided for clarity and consistency. 

The term "DNA segment or molecule or sequence" is used herein to refer to
molecules comprised of the deoxyribonucleotides adenine (A), guanine (G), thymine
(T) and/or cytosine (C). These segments, molecules or sequences can be found in
nature or synthetically derived. When read in accordance with the genetic code,
these sequences can encode a linear stretch or sequence of amino acids which can
be referred to as a polypeptide, protein, protein fragment and the like.

As used herein, the term "gene" is well known in the art and relates to a nucleic acid 
sequence defining a single protein or polypeptide. The polypeptide can be encoded 
by a full-length sequence or any portion of the coding sequence, so long as the 
functional activity of the protein is retained. 

A "structural gene" defines a DNA sequence which is transcribed into RNA and
translated into a protein that has a specific structural function as a constituent of the
viral particles. "Structural proteins" defines the HCV proteins incorporated into the virus
particles, namely core "C", E1, E2, and E2-p7.

"Non-structural proteins" defines the HCV proteins that are not comprised in viral
particles, namely NS2, NS3, NS4A, NS5A and NS5B.

"Restriction endonuclease or restriction enzyme" is an enzyme that has the capacity
to recognize a specific base sequence (usually 4, 5 or 6 base pairs in length) in a
DNA molecule, and to cleave the DNA molecule at every place where this sequence
appears. An example of such an enzyme is EcoRI, which recognizes the base
sequence G↓AATTC and cleaves a DNA molecule at this recognition site.

"Restriction fragments" are DNA molecules produced by the digestion of DNA with a 
restriction endonuclease. Any given genome or DNA segment can be digested by a 
particular restriction endonuclease into at least two discrete molecules of restriction 
fragments. 
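
The two definitions above can be illustrated by a small in silico digest of a linear DNA
string. This is only a sketch: the function name `digest` and its parameters are
hypothetical, only one strand is modeled, overlapping or partial sites are not treated
specially, and the default arguments encode the EcoRI site GAATTC cut after the first G.
The sequence in the usage note is a toy example.

```python
from typing import List


def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut a linear DNA string at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment; 1 corresponds to cutting between G and AATTC.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])  # fragment upstream of this cut
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])  # remainder downstream of the last cut
    return fragments
```

For the toy input `"AAGAATTCCCGAATTCTT"`, which contains two sites, the function returns
three fragments whose concatenation restores the input. A sequence without any site is
returned as a single fragment.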

"Agarose gel electrophoresis" is an analytical method for fractionating polynucleotide
molecules based on their size. The method is based on the fact that nucleic acid
molecules migrate through a gel as through a sieve, whereby the smallest molecule
has the greatest mobility and travels the farthest through the gel. The sieving
characteristics of the gel retard the largest molecules such that these have the
least mobility. The fractionated polynucleotides can be visualized by staining the gel
using methods well known in the art, by nucleic acid hybridization or by tagging the
fractionated molecules with a detectable label. All these methods are well known in
the art; specific methods can be found in Ausubel et al. (supra).

"Oligonucleotide or oligomer" is a molecule comprised of two or more
deoxyribonucleotides or ribonucleotides, preferably more than three. The exact size
of the molecule will depend on many factors, which in turn depend on the ultimate
function or use of the oligonucleotide. An oligonucleotide can be derived
synthetically, by cloning or by amplification.

"Sequence amplification" is a method for generating large amounts of a target
sequence. In general, one or more amplification primers are annealed to a nucleic
acid sequence. Using appropriate enzymes, sequences found adjacent to, or in
between the primers are amplified. An amplification method used herein is the
polymerase chain reaction (PCR), which can be used in conjunction with reverse
transcriptase (RT) to produce amplified DNA copies of specific RNA sequences.

"Amplification primer" refers to an oligonucleotide capable of annealing to a RNA or
DNA region adjacent to a target sequence and serving as the initiation primer for
DNA synthesis under suitable conditions well known in the art. The synthesized
primer extension product is complementary to the target sequence.

The term "domain" or "region" refers to a specific amino acid sequence that defines
either a specific function or structure within a protein. An example herein is the
NS3 protease domain comprised within the HCV non-structural polyprotein.

The terms "plasmid", "vector" or "DNA construct" are commonly known in the art and
refer to any genetic element, including, but not limited to, plasmid DNA, phage DNA,
viral DNA and the like which can incorporate the oligonucleotide sequences, or
sequences of the present invention and serve as DNA vehicle into which DNA of the
present invention can be cloned. Numerous types of vectors exist and are well
known in the art.

The terminology "expression vector" defines a vector as described above but
designed to enable the expression of an inserted sequence following transformation
or transfection into a host. The cloned gene (inserted sequence) is usually placed
under the control of control element sequences such as promoter sequences. Such
expression control sequences will vary depending on whether the vector is designed
to express the operably linked gene in vitro or in vivo in a prokaryotic or eukaryotic
host or both (shuttle vectors) and can additionally contain transcriptional elements
such as enhancer elements, termination sequences, tissue-specificity elements,
and/or translational initiation and termination sites.

A host cell or indicator cell has been "transfected" by exogenous or heterologous
DNA (e.g. a DNA construct) or RNA, when such nucleic acid has been introduced
inside the cell. The transfecting DNA may or may not be integrated (covalently
linked) into chromosomal DNA making up the genome of the cell. In prokaryotes,
yeast, and mammalian cells for example, the transfecting/transforming DNA may be
maintained on an episomal element such as a plasmid. With respect to eukaryotic
cells, an example of a stably transfected cell is one in which the transfecting DNA
has become integrated into a chromosome and is inherited by daughter cells through
chromosome replication. A host cell or indicator cell can be transfected with RNA. A
cell can be stably transfected with RNA if the RNA replicates and copies of the RNA
segregate to daughter cells upon cell division. This stability is demonstrated by the
ability of the eukaryotic cell to establish cell lines or clones comprised of a population
of daughter cells containing the transfecting DNA or RNA. Transfection methods are
well known in the art (Sambrook et al., 1989; Ausubel et al., 1994). If the RNA
encodes a genetic marker that imparts an observable phenotype, such as
antibiotic resistance, then the stable transfection of replicating RNA can be
monitored by the acquisition of such phenotype by the host cell.

As used herein the term "transduction" refers to the transfer of a genetic marker to
host cells by the stable transfection of a replicating RNA.

The nucleotide sequences and polypeptides useful to practice the invention include,
without being limited thereto, mutants, homologs, subtypes, quasi-species, alleles,
and the like. It is understood that generally, the sequences of the present invention
encode a polyprotein. It will be clear to a person skilled in the art that the polyprotein
of the present invention and any variant, derivative or fragment thereof, is
auto-processed to an active protease.

As used herein, the designation "variant" denotes, in the context of this invention, a
sequence, whether a nucleic acid or amino acid, or a molecule that retains a biological
activity (either functional or structural) that is substantially similar to that of the
original sequence. This variant may be from the same or different species and may
be a natural variant or be prepared synthetically. Such variants include amino acid
sequences having substitutions, deletions, or additions of one or more amino acids,
provided the biological activity of the protein is conserved. The same applies to
variants of nucleic acid sequences which can have substitutions, deletions, or
additions of one or more nucleotides, provided that the biological activity of the
sequence is generally maintained.

The term "derivative" is intended to include any of the above described variants
when comprising an additional chemical moiety not normally a part of these molecules.
These chemical moieties can have varying purposes including improving a
molecule's solubility, absorption, or biological half life, decreasing toxicity and
eliminating or decreasing undesirable side effects. Furthermore, these moieties can
be used for the purpose of labeling, binding, or they may be comprised in fusion
product(s). Different moieties capable of mediating the above described effects can
be found in Remington's The Science and Practice of Pharmacy (1995).
Methodologies for coupling such moieties to a molecule are well known in the art.

The term "fragment" refers to any segment of an identified DNA, RNA or amino acid
sequence and/or any segment of any of the variants or derivatives described herein
above that substantially retains its biological activity (functional or structural) as
required by the present invention.

The terms "variant", "derivative", and "fragment" of the present invention refer herein
to proteins or nucleic acid molecules which can be isolated/purified, synthesized
chemically or produced through recombinant DNA technology. All these methods
are well known in the art. As exemplified herein below, the nucleotide sequences
and polypeptides used in the present invention can be modified, for example by in
vitro mutagenesis.

As used herein, the term "HCV polyprotein coding region" means the portion of a
hepatitis C virus that codes for the polyprotein open reading frame (ORF). This ORF
may encode proteins that are the same or different than wild-type HCV proteins. The
ORF may also encode only some of the functional protein encoded by wild-type
polyprotein coding region. The protein encoded therein may also be from different
isolates of HCV, and non-HCV protein may also be encoded therein.

As used herein, the abbreviation "NTR" used in the context of a polynucleotide
molecule means a non-translated region. The term "UTR" means untranslated
region. Both are used interchangeably.

Preferred embodiments

Particularly, the invention provides a HCV self-replicating polynucleotide molecule
comprising a 5'-terminus consisting of ACCAGC (SEQ ID NO.8).

According to the first embodiment of this invention, there is particularly provided a
HCV polynucleotide construct comprising:

- a 5'-non translated region (NTR) comprising the sequence ACCAGC at, or
proximal to, its 5'-terminus;

- a HCV polyprotein coding region; and

- a 3'-NTR region.

In a second embodiment, the present invention is directed to a HCV self-replicating
polynucleotide encoding a polyprotein comprising one or more amino acid
substitutions selected from the group consisting of: R(1135)K; S(1148)G; S(1560)G;
K(1691)R; L(1701)F; I(1984)V; T(1993)A; G(2042)C; G(2042)R; S(2404)P;
L(2155)P; P(2166)L and M(2992)T.

Particularly, the invention is directed to a HCV self-replicating polynucleotide
encoding a polyprotein comprising any one of the amino acid substitutions as
described above, further comprising the amino acid substitution E(1202)G.

Alternatively, the first embodiment of the present invention is directed to a HCV
self-replicating polynucleotide molecule comprising a G2042C/R mutation.

According to the second embodiment, the present invention particularly provides a
HCV polynucleotide construct comprising:

- a 5'-NTR region comprising the sequence ACCAGC at, or proximal to, its
5'-terminus;

- a HCV polyprotein region coding for a HCV polyprotein comprising a
G(2042)C or a G(2042)R mutation; and

- a 3'-NTR region.

Preferably, the polynucleotide construct of the present invention is a DNA or RNA
molecule. More preferably, the construct is a RNA molecule. Most preferably, the
construct is a DNA molecule.

More particularly, the first embodiment of this invention is directed to a RNA
molecule encoded by the DNA molecule selected from the group consisting of: SEQ
ID NO. 2, 4, 5, 6, 7, 24 and 25.

Most particularly, the invention provides a DNA molecule selected from the group
consisting of: SEQ ID NO. 2, 4, 5, 6, 7, 24 and 25.

In a third embodiment, the invention also is directed to an expression vector 
comprising DNA forms of the above polynucleotide, operably linked with a promoter. 

Preferably, the promoter is selected from the group consisting of: T3, T7 and SP6. 


According to a fourth embodiment, there is provided a host cell transfected with the 
self-replicating polynucleotide or vector as described above. Particularly, the host 
cell is a eukaryotic cell line. More particularly, the eukaryotic cell line is a hepatic cell 
line. Most particularly, the hepatic cell line is Huh-7. 

In a fifth embodiment, the present invention provides a RNA replication assay
comprising the steps of:

a) incubating the host cell as described above under conditions suitable for
RNA replication;

b) isolating the total cellular RNA from the cells; and

c) analyzing the RNA so as to measure the amount of HCV RNA replicated.

Preferably, the analysis of RNA levels in step c) is carried out by amplifying the RNA
by real-time RT-PCR analysis using HCV specific primers so as to measure the
amount of HCV RNA replicated.

Alternatively in this fifth embodiment, the construct comprises a reporter gene, and
the analysis of RNA levels in step c) is carried out by assessing the level of reporter
expressed.

According to a preferred aspect of the sixth embodiment, the invention is directed to
a method for testing a compound for inhibiting HCV replication, including the steps
of:

a) carrying out step a) as described in the above assay, in the presence or
absence of the compound;

b) isolating the total cellular RNA from the cells;

c) analyzing the RNA so as to measure the amount of HCV RNA replicated; and

d) comparing the levels of HCV RNA in cells in the absence and presence of
the inhibitor,

wherein reduced RNA levels are indicative of the ability of the compound to inhibit
replication.

Preferably, the cell line is incubated with the test compound for about 3-4 days at a 
temperature of about 37°C. 


EXAMPLES 

Example 1 

Replicon Constructs (APGK-12; Figure 1)

pET9a-EMCV was obtained by ligating an oligonucleotide linker
5' gaattccagatggcgcgcccagatgttaaccagatccatggcacactctagagtactgtcgac 3' (SEQ ID
NO.9) to pET-9a (Novagen) that was cut with EcoRI and SalI to form the vector
pET-9a-mod. This linker contains the following restriction sites: EcoRI, AscI, HpaI, NcoI,
XbaI, ScaI, SalI. The EMCV IRES was amplified by PCR from the vector pTM1 with
primers
5' cggaatcgttaacagaccacaacggtttccctc 3' (SEQ ID NO.10) and 5'
ggcgtacccatggtattatcgtgtttttca 3' (SEQ ID NO.11) and ligated into pET-9a-mod via
EcoRI and NcoI to form pET-9a-EMCV.

The sequence of HCV NS2 to NS5B followed by the 3'UTR of HCV was obtained
from the replicon construct I377/NS2-3' (Lohmann et al., 1999; accession number
AJ242651) and synthesized by Operon Technologies Inc. with a T to C change at
the NcoI site in NS5B at nucleotide 8032. This sequence was released from a
GenOp® vector (Operon Technologies) with NcoI and ScaI and transferred into
pET-9a-EMCV to form pET-9a-EMCV-NS2-5B-3'UTR.

pET-9a-HCV-neo was obtained by amplification of the HCV IRES from a HCV cDNA
isolated from patient serum with primers
5' gcatatgaattctaatacgactcactataggccagcccccgattg 3' (SEQ ID NO.12) containing a
T7 promoter and primer
5' ggcgcgccctttggtttttctttgaggtttaggattcgtgctcat 3' (SEQ ID NO.13) and amplification
of the neomycin phosphotransferase gene from the vector pcDNA 3.1 (Invitrogen)
with primers
5' aaagggcgcatgattgaacaagatggattgcacgca 3' (SEQ ID NO.14) and 5'
gcatatgttaactcagaagaactcgtcaagaaggcgata 3' (SEQ ID NO.15). These two PCR
fragments were mixed and amplified with primers
5' gcatatgaattctaatacgactcactataggccagcccccgattg 3' (SEQ ID NO.16) and
5' gcatatgttaactcagaagaactcgtcaagaaggcgata 3' (SEQ ID NO.15), cut with EcoRI
and HpaI and transferred into pET-9a-mod to form pET-9a-HCV-neo. The
EMCV-NS2-5B-3'UTR was released from pET-9a-EMCV-NS2-5B-3'UTR with HpaI and
ScaI and transferred into pET-9a-HCV-neo that was cut with HpaI to form
pET-9a-APGK12. This insert was sequenced with specific successive primers using an ABI
Prism® BigDye™ Terminator Cycle sequencing kit and analyzed on an ABI Prism® 377
DNA Sequencer and is shown in SEQ ID NO.1.

RNA in vitro transcription 

pET-9a-APGK12 DNA was cut with ScaI for expression of the full-length replicon or
with BglII for expression of a truncated negative control RNA. DNA was analyzed on
a 1% agarose gel and purified by phenol/chloroform extraction. RNA was produced
using a T7 Ribomax® kit (Promega) followed by extraction with phenol/chloroform
and precipitation with 7.5 M LiCl. RNA was treated with DNase I for 15 min to
remove the DNA template and further purified with an RNeasy® column (Qiagen).
RNA integrity was verified on a denaturing formaldehyde 1% agarose gel.

Example 2

Primary transfection of Huh7 cells and selection of replicon cell lines 

Human hepatoma Huh7 cells (Health Science Research Resources Bank, Osaka,
Japan) were grown in 10% FBS/DMEM. Cells were grown to 70% confluence,
trypsinized, washed with phosphate buffered saline (PBS) and adjusted to 1 x 10^7
cells/ml of PBS. 800 µl of cells were transferred into 0.4 cm cuvettes and mixed with
15 µg of replicon RNA. Cells were electroporated using 960 µF, 300 volts for ~18
msec and evenly distributed into two 15 cm tissue culture plates and incubated in a
tissue culture incubator for 24 hours. The selection of first and second generation
replicon cell lines was with 10% FBS/DMEM medium supplemented with 1 mg/ml of
G418. Cells were selected for 3-5 weeks until colonies were observed that were
isolated and expanded.

Following the G418 selection and propagation of Huh-7 cells transfected with
APGK12 (SEQ ID NO. 1) RNA, cells that formed a distinct colony were treated with
trypsin and serially passed into larger culture flasks to establish cell lines.
Approximately 10 x 10^6 cells were harvested from each cell line. The cells were
lysed and the total cellular RNA extracted and purified as outlined in Qiagen
RNeasy® preparatory procedures. Figure 2 shows the analysis of 12 µg of total
cellular RNA from various cell lines as analyzed on a Northern blot of a denaturing
agarose-formaldehyde gel.

Figure 2A is a Northern blot (radioactively probed with HCV specific minus-strand
RNA) that detects the presence of plus-strand replicon RNA. Lanes 1 and 2 are
positive controls that contain 10^9 copies of in vitro transcribed APGK12 RNA. Lane 2
contains the in vitro transcribed RNA mixed with 12 µg of total cellular RNA from naive
Huh-7 cells. Lane 3 is a negative control of total cellular RNA from untreated Huh-7
cells. Lanes 4 and 5 contain cellular RNA from the B1 and B3 G418 resistant cell
lines that have DNA integrated copies of the neomycin phosphotransferase gene.
Lane 6 contains total cellular RNA from a Huh-7 cell line, designated S22.3, that
harbors high copy number of HCV sub-genomic replicon RNA as detected by the
positive signal in the 8 kilo-base range. Other cell lines have no detectable replicon
RNA. Figure 2B is a Northern blot of a duplicate of the gel presented in 2A with the
exception that the blot was radioactively probed with HCV specific plus-strand RNA
to detect the presence of HCV minus-strand RNA (lanes 1 and 2 are positive control
lanes that contain 10^9 copies of full length genomic HCV minus strand RNA); only
lane 6, which contains 12 µg of total cellular RNA from cell line S22.3, harbors
detectable minus-strand replicon RNA at the expected size of 8 - 9 kilobases. A
quantitative estimation of RNA copy number, based on phosphorimager scanning of
the Northern blots, is approximately 6 x 10^7 copies of plus-strand/µg of total RNA,
and 6 x 10^6 copies of minus strand/µg of total RNA. The presence of the plus-strand
and minus-strand intermediate confirms that the HCV sub-genomic RNA is actively
replicating in the S22.3 cell line.

Example 3

S22.3 cell line constitutively expresses HCV non-structural proteins.

HCV non-structural protein expression was examined in the S22.3 cell line. Figure 3
displays the result of indirect immunofluorescence that detects the HCV NS4A
protein in the S22.3 cell line and not in the replicon negative B1 cell line (a G418
resistant Huh-7 cell line). Indirect immunofluorescence was performed on cells that
were cultured and fixed (with 4% paraformaldehyde) onto Lab-tek chamber slides.
Cells were permeabilized with 0.2% Triton X-100 for 10 minutes followed by a 1 hour
treatment with 5% milk powder dissolved in phosphate-buffered saline (PBS). A
rabbit serum containing polyclonal antibody raised against a peptide spanning the
HCV NS4A region was the primary antibody used in detection. Following a 2 hour
incubation with the primary antibody, cells were washed with PBS and a secondary
goat anti-rabbit antibody conjugated with red-fluor Alexa® 594 (Molecular Probes)
was added to cells for 3 hours. Unbound secondary antibody was removed with PBS
washes and cells were sealed with a cover slip. Figure 3 (top panels) shows the
results of immunofluorescence as detected by a microscope with specific fluorescent
filtering; the bottom panels represent the identical field of cells viewed by differential
interference contrast (DIC) microscopy. The majority of S22.3 (Figure 3A) cells within
the field stain positively for HCV NS4A protein that localizes in the cytoplasm,
whereas the B1 cells (Figure 3B), which fail to express any HCV proteins, only have
background level of staining. A small proportion of S22.3 cells express high levels of
intensely stained HCV NS4A.

Expression of the proteins encoded by the bi-cistronic replicon RNA was also
examined on Western blots following SDS-PAGE separation of total proteins
extracted from: (i) naive Huh-7 cell line, (ii) neomycin resistant Huh-7 cell line B1,
and (iii) the S22.3 cell line. Figure 4 panels A, B, and C demonstrate the results of
western blots probed with rabbit polyclonal antisera specific for neomycin
phosphotransferase (NPT), HCV NS3, and HCV NS5B, respectively. Visualization
was achieved through autoradiographic detection of a chemiluminescent reactive
secondary HRP-conjugated goat anti-rabbit antibody. Figure 4 panel A shows that
the S22.3 RNA replicon cell line expresses the NPT protein at levels higher than B1
cells (which contain an integrated DNA copy of the npt gene) and that the naive
Huh-7 cell line does not produce the NPT protein. Figure 4 panels B and C show
that only the S22.3 cell line produces the mature HCV NS3 and NS5B proteins,
respectively. The western blots demonstrate that the S22.3 cell line, which harbors
actively replicating HCV sub-genomic replicon RNA, maintains replication of the
RNA through the high level expression of the HCV non-structural proteins.

Example 4

Sequence determination of adapted replicons

Total RNA was extracted from replicon containing Huh7 cells using an RNeasy Kit
(Qiagen). Replicon RNA was reverse transcribed and amplified by PCR using a
OneStep RT-PCR kit (Qiagen) and HCV specific primers (as selected from the
full-length sequence disclosed in WO 00/66623). Ten distinct RT-PCR products, which
covered the entire bi-cistronic replicon in a staggered fashion, were amplified using
oligonucleotide primers. The PCR fragments were sequenced directly with ABI
Prism® BigDye™ Terminator Cycle PCR Sequencing and analyzed on an ABI Prism®
377 DNA Sequencer. To analyze the sequence of the HCV replicon 3' and 5' ends, an
RNA ligation/RT-PCR procedure described in Kolykhalov et al., 1996 was followed.
The nucleotide sequence of S22.3 is presented as SEQ ID NO. 2.

Example 5

Serial Passage of HCV Replicon RNA 



The total cellular RNA from the S22.3 cell line was prepared as described above.
HCV Replicon RNA copy number was determined by Taqman® RT-PCR analysis
and 20 µg of total S22.3 cellular RNA (containing 1 x 10^9 copies of HCV RNA) was
transfected by electroporation into 8 x 10^6 naive Huh-7 cells. Transfected cells were
subsequently cultured in 10 cm tissue culture plates containing DMEM
supplemented with 10% fetal calf serum (10% FCS). Media was changed to DMEM
(10% FCS) supplemented with 1 mg/ml G418 24 hours after transfection and then
changed every three days. Twenty-three visible colonies formed three to four weeks
post-transfection and G418 selection. G418 resistant colonies were expanded into
second generation cell lines that represent the first cell lines harboring serially
passaged HCV Replicon RNA. Three of these cell lines, R3, R7, and R16, were the
subject of further analyses. First, the efficiency of transduction by each of the
adapted replicons was determined by electroporation of the total cellular RNA
(extracted from the R3, R7 and R16 cell lines) into naive Huh-7 cells; following
electroporation, the transduction efficiency was determined as described above, by
counting the visible G418 resistant colonies that arose following 3 to 5 weeks of
G418 selection (Table 1). Second, the sequence of the serially passed adapted
replicons was determined from the total cellular RNA that was extracted from each of
the R3, R7 and R16 replicon cell lines as described in Example 4 (SEQ ID NO. 4, 5,
6). Using pAPGK12 as a reference sequence (SEQ ID NO. 1), the nucleotide
changes that were selected in the HCV segment of the adapted replicons are presented
in Figure 5A. Some of these nucleotide changes are silent and do not change the
encoded amino acid whereas others result in an amino acid substitution. Figure 5B
summarizes the amino acid changes encoded by the adapted replicons with the
amino acid sequence of pAPGK12 as the reference. It is important to note that the
reference sequence APGK-12 (SEQ ID NO.1) contains an extra G at the 5'-terminal
(5'-GG) that is not maintained in the replicating RNA of the established cell lines.
Also noteworthy is that, in addition to G->A at nucleotide 1, there is also an adapted
mutation G->C/R at amino acid 2042 (shown as amino acid 1233 in the sequence
listing since a.a. 810 of NS2 is numbered as a.a. 1 in SEQ ID) that can be found in
all clones analyzed.
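
The numbering offset noted above can be made explicit with a short sketch. It assumes only what the text states, namely that a.a. 810 of the polyprotein is numbered a.a. 1 in the sequence listing. The function names and the regular expression for labels such as G(2042)C are illustrative and are not part of the disclosure.

```python
import re

# Offset stated in the text: a.a. 810 of the polyprotein (in NS2) is a.a. 1 in the SEQ ID.
SEQ_ID_OFFSET = 809


def polyprotein_to_seq_id(position: int) -> int:
    """Convert HCV polyprotein numbering to sequence-listing numbering."""
    if position <= SEQ_ID_OFFSET:
        raise ValueError("position lies upstream of the region in the sequence listing")
    return position - SEQ_ID_OFFSET


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'G(2042)C' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])\((\d+)\)([A-Z])", label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


original, position, replacement = parse_substitution("G(2042)C")
print(original, polyprotein_to_seq_id(position), replacement)  # G 1233 C
```

The same conversion applies to any of the substitutions listed in the embodiments, provided they are expressed in polyprotein numbering.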




TABLE 1 
Transfection of Huh-7 cells 



RNA    Copies of Replicon    # Colonies    SEQ ID

5 ng APGK12 replicon in 20 µg total Huh-7 RNA
15 µg APGK12 replicon RNA
20 µg total:
S22.3 cellular RNA
R3 cellular RNA
R7 cellular RNA
R16 cellular RNA
cloned R3rep RNA



1.2x10 



9 



3x10 



12 



3x10 



1 x10 

1 x10* 
3x10 



9 



8 



2.3x10 



8 



1 (S22.3) 



23 (3 clones 
analyzed) 

200 
20 
100 
2000 



1 



4 

5 
6 
7 



Example 6

Construction of APGK12 with 5' G->A substitution (APGK12-5'A, SEQ ID
NO.24)

The pAPGK12 DNA was modified to change the first nucleotide in the sequence to
replace the 5'GG with a 5'A. The change in the pAPGK12 was introduced by
replacing an EcoRI/AgeI portion of the sequence with a PCR-generated EcoRI/AgeI
fragment that includes the mutation. The oligonucleotides used for the amplification
were (SEQ ID. NO. 20): 5'-GTG GAC GAA TTC TAA TAC GAC TCA CTA TAA CCA
GCC CCC GAT TGG-3' and (SEQ ID. NO. 21): 5'-GGA ACG CCC GTC GTG GCC
AGC CAC GAT-3' and generated a 195 bp DNA fragment that was then digested
with EcoRI and AgeI. The resulting 178 bp restriction fragment was used to replace
the EcoRI/AgeI fragment in pAPGK12 to generate the pAPGK12-5'A plasmid.
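
The effect of the mutagenic primer can be checked by locating the T7 promoter in each forward primer and reading the bases that follow it. The sketch below is illustrative only: it assumes that transcription begins immediately after the 17-nt core TAATACGACTCACTATA, which is a simplification, and the helper names are hypothetical. The two primer sequences are copied from Examples 1 and 6.

```python
T7_CORE = "TAATACGACTCACTATA"  # assumption: transcript starts at the base right after this core
ECORI_SITE = "GAATTC"


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def predicted_transcript_start(primer: str, length: int = 6) -> str:
    """Return the first `length` bases downstream of the T7 promoter core."""
    seq = clean(primer)
    index = seq.find(T7_CORE)
    if index < 0:
        raise ValueError("no T7 promoter core found in primer")
    start = index + len(T7_CORE)
    return seq[start:start + length]


# Forward primer used to build pAPGK12 (SEQ ID NO.12) and the mutagenic primer (SEQ ID NO.20).
SEQ_ID_NO_12 = "gcatatgaattctaatacgactcactataggccagcccccgattg"
SEQ_ID_NO_20 = "GTG GAC GAA TTC TAA TAC GAC TCA CTA TAA CCA GCC CCC GAT TGG"

print(predicted_transcript_start(SEQ_ID_NO_12))  # GGCCAG
print(predicted_transcript_start(SEQ_ID_NO_20))  # ACCAGC
print(ECORI_SITE in clean(SEQ_ID_NO_20))         # True
```

Under that assumption the original primer yields a transcript beginning 5'-GG, while the mutagenic primer yields the ACCAGC 5'-terminus of SEQ ID NO.8.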



Example 7 

cDNA cloning of the R3-replicon (R3rep).

The cDNA clone of the R3 replicon was produced by RT-PCR of RNA extracted from
the R3 cell line. The following two oligonucleotides were used: (SEQ ID. NO. 22):
5'-GTC GTC TTC TCT GAC ATG GAG AC-3' and (SEQ ID. NO. 23): 5'-GAG TTG
CTC AGT GGA TTG ATG GGC AGC-3'. The ~4400 nt PCR fragment, starting
within the NS2 coding region and extending to the 5'-end of the NS5B coding region,
was cloned into the plasmid pCR3.1 by TA cloning (Invitrogen). The SacII/XhoI
portion of this R3 sequence was then used to replace the SacII/XhoI fragment
present in the pAPGK12 and the pAPGK12-5'A described above. Consequently,
two R3 cDNA sequences were generated: (i) R3-Rep-5'G with an initiating 5'G (SEQ
ID NO.7), and (ii) R3-Rep-5'A (SEQ ID NO.25) with an initiating 5'A. Sequencing of the
R3 rep cDNA identified unique nucleotide changes that differ from the original
pAPGK12 sequence (see Figure 5A); some of these changes are silent and do not
change the encoded amino acid, whereas others do result in an amino acid change
(see Figure 5B). The differences between R3 and the R3-rep reflect the isolation of a
unique R3-rep cDNA clone encoding nucleotide changes that were not observed
from the sequencing of the total RNA extracted from the R3 cell line.

Example 8

Efficiency of colony formation with modified constructs

RNA from pAPGK12, pAPGK12-5'A, pR3-Rep and pR3-Rep-5'A was generated by
in vitro transcription using the T7 Ribomax® kit (Promega) as described in Example
1 above. The reactions containing the pAPGK12-5'A and pR3-Rep-5'A templates
were scaled-up 10-fold due to the limitation of commercial RNA polymerase in
initiating transcripts with 5'-A. The full length RNAs and control truncated RNA for
each clone were introduced into 8 x 10^6 naive Huh-7 cells by electroporation as
described in Example 2. Replicon RNA was supplemented with total cellular Huh-7
carrier RNA to achieve a final 15-20 µg quantity. The cells were then cultured in
DMEM medium supplemented with 10% fetal calf serum and 0.25 mg/ml G418 in
two 150 mm plates. The lower concentration of G418 was sufficient to isolate and
select replicon containing cell lines as none of the transfectants with the control
truncated RNA produced any resistant colonies. In contrast, in vitro transcribed
APGK-12 RNAs that harbor either a 5'G or 5'A form colonies with the same
efficiency (ca. 80 cfu/µg in Figure 6 panels A and B) following selection with G418.
Various quantities (ranging from 0.1 ng to 1 µg) of the R3-rep-5'A RNA were
transfected into naive Huh-7 cells to determine a colony formation efficiency of 1.2 x
10^6 cfu/µg of RNA (Figure 6 panel C depicts transfection with 1 µg of RNA). Various
quantities (ranging from 0.1 ng to 1 µg) of R3-rep [5'G] were similarly transfected
resulting in a colony formation efficiency of 2 x 10^6 cfu/µg of RNA (Figure 6 panel D
depicts colony formation with 1 µg of RNA). Note that, shown for the first time, HCV
subgenomic replicons replicate as efficiently with a 5'A nucleotide in place of the
5'G. APGK12 RNAs with a 5'A or 5'G have similar transduction efficiencies. Similarly,
R3-Rep RNAs with either the 5'A or 5'G both display the markedly increased
transduction efficiency. Notably, the adaptive mutations within the HCV non-structural
segment encoded by the R3-Rep provide for a substantial increase in transduction
efficiency as depicted by the dramatic increase in colony forming units per µg of
transfected RNA.
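
The size of this increase can be worked through from the efficiencies reported in this example. The following is a minimal sketch; the function names are illustrative, and the only inputs are the cfu/µg values quoted above (about 80 for APGK12, 1.2 x 10^6 for R3-rep-5'A and 2 x 10^6 for R3-rep [5'G]).

```python
import math


def colony_forming_efficiency(colonies: float, rna_micrograms: float) -> float:
    """Colony forming units per microgram of transfected replicon RNA."""
    if rna_micrograms <= 0:
        raise ValueError("RNA quantity must be positive")
    return colonies / rna_micrograms


def fold_increase(adapted_cfu_per_ug: float, reference_cfu_per_ug: float) -> float:
    """Ratio of an adapted replicon's efficiency to the reference efficiency."""
    return adapted_cfu_per_ug / reference_cfu_per_ug


# Efficiencies quoted in Example 8 (cfu per microgram of RNA).
apgk12 = 80.0
r3_rep_5a = 1.2e6
r3_rep_5g = 2.0e6

for name, value in (("R3-rep-5'A", r3_rep_5a), ("R3-rep [5'G]", r3_rep_5g)):
    fold = fold_increase(value, apgk12)
    print(f"{name}: {fold:,.0f}-fold, {math.log10(fold):.2f} orders of magnitude")
```

Both adapted replicons come out at slightly more than four orders of magnitude above the APGK12 reference.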

Example 9

Quantification of HCV Replicon RNA Levels in Cell lines

S22.3 cells, or cell lines harboring other adapted replicons, were seeded in DMEM
supplemented with 10% FBS, PenStrep and 1 µg/mL Geneticin. At the end of the
incubation period the replicon copy number is evaluated by real-time RT-PCR with
the ABI Prism 7700 Sequence Detection System. The TAQMAN® EZ RT-PCR kit
provides a system for the detection and analysis of HCV RNA (as first demonstrated
by Martell et al. 1999 J. Clin. Microbiol. 37: 327-332). Direct detection of the reverse
transcription polymerase chain reaction (RT-PCR) product with no downstream
processing is accomplished by monitoring the increase in fluorescence of a
dye-labeled DNA probe (Figure 6). The nucleotide sequence of both primers (adapted
from Ruster, B., Zeuzem, S. and Roth, W.K., 1995. Analytical Biochemistry 224:597-600)
and probe (adapted from Hohne, M., Roeske, H. and Schreier, E. 1998, Poster
Presentation: P297 at the Fifth International Meeting on Hepatitis C Virus and
Related Viruses Molecular Virology and Pathogenesis, Venezia-Lido Italy, June 25-28,
1998) located in the 5'-region of the HCV genome are the following:

HCV Forward primer:

5' ACG CAG AAA GCG TCT AGC CAT GGC GTT AGT 3' (SEQ ID NO.17)

HCV Reverse primer:

5' TCC CGG GGC ACT CGC AAG CAC CCT ATC AGG 3' (SEQ ID NO.18)

HCV Probe:

5' FAM-TGG TCT GCG GAA CGG GTG AGT ACA CC-TAMRA 3' (SEQ ID NO.19)

FAM: Fluorescence reporter dye.
TAMRA: Quencher dye.



Using the TAQMAN® EZ RT-PCR kit, the following reaction was set up:

Component                      Volume per sample (µL)   Final Concentration
RNase-Free Water               16
5X Taqman EZ Buffer            10                       1X
Manganese Acetate 25 mM        6                        3 mM
dATP 10 mM                     1.5                      300 µM
dCTP 10 mM                     1.5                      300 µM
dGTP 10 mM                     1.5                      300 µM
dUTP 20 mM                     1.5                      300 µM
HCV Forward Primer 10 µM       1                        200 nM
HCV Reverse Primer 10 µM       1                        200 nM
HCV Probe 5 µM                 2                        200 nM
rTth DNA Polymerase 2.5 U/µL   2                        0.1 U/µL
AmpErase UNG 1 U/µL            0.5                      0.01 U/µL
Total Mix                      45

To this reaction mix, 5 µL of total RNA extracted from S22.3 cells diluted at 10 ng/µL
was added, for a total of 50 ng of RNA per reaction. The replicon copy number was
evaluated with a standard curve made from known amounts of replicon copies
(supplemented with 50 ng of wild type Huh-7 RNA) and assayed in an identical
reaction mix (Figure 7).
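
Preparing standards with known copy numbers requires converting a mass of in vitro transcribed RNA into a number of molecules. The sketch below shows one common way to do this; it is not taken from the disclosure. The average nucleotide mass, the 5'-triphosphate correction and the 8000 nt replicon length are stated assumptions (the text only indicates a replicon in the 8 kilobase range), and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23       # molecules per mole
AVERAGE_NT_MASS = 320.5        # g/mol per RNA nucleotide (approximation, assumed)
TRIPHOSPHATE_MASS = 159.0      # g/mol for the 5' triphosphate of a transcript (assumed)
ASSUMED_REPLICON_LENGTH = 8000 # nt; assumption based on the ~8 kb size given in the text


def rna_copies(mass_grams: float, length_nt: int = ASSUMED_REPLICON_LENGTH) -> float:
    """Estimate the number of RNA molecules in a given mass of a single-stranded RNA."""
    molar_mass = length_nt * AVERAGE_NT_MASS + TRIPHOSPHATE_MASS
    return mass_grams / molar_mass * AVOGADRO


def standard_series(top_copies: float, dilution_factor: float, points: int) -> list[float]:
    """Copy numbers for a serial dilution starting at top_copies."""
    return [top_copies / dilution_factor ** i for i in range(points)]


copies_in_5_ng = rna_copies(5e-9)
print(f"5 ng of an {ASSUMED_REPLICON_LENGTH} nt RNA: about {copies_in_5_ng:.2e} copies")
print(standard_series(1e8, 10, 4))
```

With these assumptions, nanogram quantities of replicon RNA correspond to roughly 10^8 to 10^9 copies, which is the range spanned by a typical dilution series for such a curve.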

Thermal cycler parameters used for the RT-PCR reaction on the ABI Prism 7700
Sequence Detection System were optimized for HCV detection:

Cycle   Temperature (°C)   Time (Minutes)   Repeat   Reaction
Hold    50                 2                         Initial Step
Hold    60                 30                        Reverse Transcription
Hold    95                 5                         UNG Deactivation
Cycle   95                 0:15             2        Melt
        60                 1                         Anneal/Extend
Cycle   90                 0:15             40       Melt
        60                 1                         Anneal/Extend

Quantification is based on the threshold cycle, where the amplification plot crosses a
defined fluorescence threshold. Comparison of the threshold cycles provides a
highly sensitive measure of relative template concentration in different samples.
Monitoring during early cycles, when PCR fidelity is at its highest, provides precise
data for accurate quantification. The relative template concentration can be
converted to RNA copy numbers by employing a standard curve of HCV RNA with
known copy number (Figure 7).
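
The conversion from threshold cycle to copy number can be sketched as a linear fit of threshold cycle against log10(copies), inverted for unknown samples. This is a generic illustration of the standard-curve approach, not the instrument's own software. The function names are hypothetical and the standards below are invented placeholder values, not data from Figure 7.

```python
import math


def fit_standard_curve(copies: list[float], ct_values: list[float]) -> tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the fitted standard curve to estimate copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


# Placeholder standards (hypothetical values chosen only to exercise the code).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [40.0 - 3.32 * math.log10(c) for c in standard_copies]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_ct = 24.5
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"estimated copies at Ct {unknown_ct}: {copies_from_ct(unknown_ct, slope, intercept):.3g}")
```

A lower threshold cycle corresponds to more starting template, so the fitted slope is negative.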

Example 10

A specific HCV NS3 protease anti-viral compound inhibits replication of the 
HCV replicon in S22.3 cell lines. 

In order to determine the effect of a specific HCV NS3 protease anti-viral compound
on replicon levels in S22.3 cells, the cells were seeded in 24 Well Cell Culture
Cluster at 5 x 10^4 cells per well in 500 µL of DMEM complemented with 10% FBS,
PenStrep and 1 µg/mL Geneticin. Cells were incubated until compound addition in a
5% CO2 incubator at 37 °C. The dose-response curve of the inhibitor displayed 11
concentrations resulting from serial two-fold dilutions (1:1). The starting
concentration of compound A was 100 nM. One control well (without any compound)
was also included in the course of the experiment. The 24 well plates were
incubated for 4 days in a 5% CO2 incubator at 37 °C. Following a 4 day incubation
period, the cells were washed once with PBS and RNA was extracted with the
RNeasy® Mini Kit and Qiashredder® from Qiagen. RNA from each well was eluted
in 50 µL of H2O. The RNA was quantified by optical density at 260 nm on a Cary 1E
UV-Visible Spectrophotometer. 50 ng of RNA from each well was used to quantify
the HCV replicon RNA copy number as detailed in Example 9. The level of inhibition
(% inhibition) of each well containing inhibitor was calculated with the following
equation (CN = HCV Replicon copy number):

\[
\%\ \text{inhibition} = \left( \frac{CN_{\text{control}} - CN_{\text{well}}}{CN_{\text{control}}} \right) \times 100
\]



The calculated % inhibition values were then used to determine IC50, slope factor (n)
and maximum inhibition (Imax) by the non-linear regression routine NLIN procedure of
SAS using the following equation:

\[
\%\ \text{inhibition} = \frac{I_{\max} \times [\text{inhibitor}]^{n}}{[\text{inhibitor}]^{n} + IC_{50}^{\,n}}
\]



Compound A was tested in the assay at least 4 times. The IC50 curves were
analyzed individually by the SAS nonlinear regression analysis. Figure 8 shows a
typical curve and Table 2 shows the individual and average IC50 values of compound
A. The average IC50 of compound A in the replication assay was 1.1 nM.
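
The calculation described in this example can be illustrated with a short sketch. It implements the % inhibition formula and the dose-response model with slope factor n and maximum inhibition Imax, and it builds the 11-point two-fold dilution series starting at 100 nM. The fitting step is a simple log-linearized least-squares fit with Imax fixed at 100, used here as a stand-in for the SAS NLIN procedure, which is not reproduced. All function names are hypothetical, and the inhibition values are simulated from the model rather than taken from Figure 8 or Table 2.

```python
import math


def dilution_series(start: float, factor: float, points: int) -> list[float]:
    """Concentrations for a serial dilution, highest first."""
    return [start / factor ** i for i in range(points)]


def percent_inhibition(cn_control: float, cn_well: float) -> float:
    """% inhibition from replicon copy numbers (CN) of control and treated wells."""
    return (cn_control - cn_well) / cn_control * 100.0


def hill(conc: float, ic50: float, n: float, i_max: float = 100.0) -> float:
    """Dose-response model: Imax * c^n / (c^n + IC50^n)."""
    return i_max * conc ** n / (conc ** n + ic50 ** n)


def fit_hill_linearized(concs, inhibitions, i_max: float = 100.0) -> tuple[float, float]:
    """Estimate (IC50, n) from log10(I / (Imax - I)) = n*log10(c) - n*log10(IC50)."""
    points = [(math.log10(c), math.log10(i / (i_max - i)))
              for c, i in zip(concs, inhibitions) if 0.0 < i < i_max]
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (-intercept / slope), slope


concentrations_nm = dilution_series(100.0, 2.0, 11)          # 100 nM down to ~0.098 nM
simulated = [hill(c, ic50=1.1, n=1.2) for c in concentrations_nm]  # simulated, not measured
ic50_estimate, slope_estimate = fit_hill_linearized(concentrations_nm, simulated)
print(f"IC50 ~ {ic50_estimate:.2f} nM, slope factor ~ {slope_estimate:.2f}")
```

With real, noisy data a true non-linear regression that also estimates Imax would be preferable to this linearized fit.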


TABLE 2

IC50 of compound A in the S22.3 Cell line Replicon Assay.

Compound   IC50 (nM)   Average IC50 (nM)
A          1.2         1.1 ± 0.2
           1.2
           1.0
           0.9



Discussion

The reproducible and robust ex vivo propagation of hepatitis C virus, to levels
required for the accurate testing of potential anti-viral compounds, has not been
achieved with any system. As an alternative approach to studying the molecular
mechanisms of hepatitis C virus RNA replication, selectable self-replicating
bi-cistronic RNAs were developed (Lohmann et al., 1999, Science 285:110-113;
Bartenschlager CA 2,303,526). Minimally, these replicons encode some or all of
the non-structural proteins and also carry a selectable marker such as the neomycin
phosphotransferase. Though intracellular steady-state levels of these sub-genomic
replicon RNAs among the selected clones are moderate to high, the frequency of
generating G418-resistant colonies upon transfection of the consensus RNA
described by Lohmann et al. or Bartenschlager is very low. Fewer than 100 colonies
are generated when 8 million cells are transfected with 1 µg of in vitro transcribed
bi-cistronic replicon RNA. A low efficiency of colony formation was first noted by
Lohmann et al. (1999, Science 285:110-113). Since then, Lohmann et al.
(2001), Blight et al. (2000), and Guo et al. (2001) have isolated sub-genomic RNAs
with markedly improved efficiencies in the colony formation assay. Lohmann et al.,
1999 originally reported that selection of sub-genomic replicons may not involve the
selection of adaptive mutants as serially passaged RNA did not demonstrate an
improved transfection efficiency. Nevertheless, in an effort to characterize the
function and fitness of replicating HCV RNA, we serially passaged the replicon RNA
that was isolated from the first selected cell line. Notably, a significant increase in
colony forming efficiency was obtained from this experiment, even though the
quantity of replicon RNA was orders of magnitude lower than originally used to
transfect the in vitro transcribed RNA. Furthermore, a second round serial passage
of replicon RNA from this first generation clone into naive Huh-7 cells provided for
yet another increase in colony formation efficiency (Table 1).

Our analysis of replicating HCV RNAs identified several adaptive mutations that
enhance the efficiency of colony formation by up to 4 orders of magnitude. Adaptive
mutations were found in many non-structural proteins, as well as in the 5'
non-translated region. The replacement of the 5'-GG doublet by a 5'-A as the
inaugurating nucleotide of the HCV 5'-UTR is a variant of the HCV genome that has
not been previously described, despite the sequencing of innumerable genotypes and
subtypes from across the world. Our original replicon, which carried a 5'-GG,
evolved to variants with either a single 5'-A or a single 5'-G, both of which showed
equal transduction efficiency. This is the first report of a HCV genome that can
tolerate and stably maintain a 5'-A extremity. Moreover, we were successful in
re-introducing this defined single nucleotide substitution into our cDNA clone and
in generating in vitro transcribed RNA harboring such an extremity, confirming that
a 5'-A functions as efficiently as a 5'-G.



We have identified adaptive amino acid substitutions in the HCV non-structural
proteins NS3, NS4A and NS5A in the R3 replicon, and a substitution in NS5B in the
R7 clone (see Figure 5B). These mutations, particularly the combination defined by
the R3-rep (SEQ ID NO. 7), when reconstituted into a cDNA clone and transcribed
into a RNA replicon, result in a significantly enhanced transduction efficiency, up
to 20,000 fold higher than that of the original wild type APGK12 replicon RNA.
However, the steady state levels of intracellular replicon RNA were comparable
among the different isolated clones. This result suggests that the adaptive
mutations do not produce higher stable intracellular RNA levels through higher RNA
replication, but rather confer increased permissivity for establishing the replicon
in a greater number of Huh-7 cells. Such a phenotype may be manifested transiently,
through an initial increase in the amount of de novo replication that is required to
surpass a defined threshold to establish persistently replicating RNAs within a
population of dividing cells.
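The fold enhancement discussed above is a ratio of colony-forming efficiencies, each normalised to the amount of RNA transfected. The sketch below illustrates the arithmetic only. The function names and the colony counts are hypothetical and are not data from this disclosure; RNA amounts are in arbitrary but consistent mass units.

```python
import math


def colony_forming_efficiency(colonies: int, rna_amount: float) -> float:
    """Return colonies obtained per unit mass of transfected replicon RNA."""
    if rna_amount <= 0:
        raise ValueError("rna_amount must be positive")
    return colonies / rna_amount


def fold_enhancement(adapted_cfe: float, wild_type_cfe: float) -> float:
    """Return the ratio of adapted to wild type colony-forming efficiency."""
    return adapted_cfe / wild_type_cfe


# Hypothetical illustration values, NOT measurements from this disclosure.
wild_type = colony_forming_efficiency(colonies=5, rna_amount=1.0)
adapted = colony_forming_efficiency(colonies=1000, rna_amount=0.01)

fold = fold_enhancement(adapted, wild_type)
orders_of_magnitude = math.log10(fold)

print(f"fold enhancement: {fold:,.0f} ({orders_of_magnitude:.1f} orders of magnitude)")
```

With these illustrative inputs the ratio is 20,000, a little over four orders of magnitude. This shows how a 20,000 fold figure and an enhancement of about 4 orders of magnitude describe the same scale of effect.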


Recently three other groups also identified distinct adaptive mutants. Lohmann
et al. (2000) reported enhanced transduction efficiencies of up to 10,000 fold with
mutations in NS3, NS4B, NS5A and NS5B. Blight et al. (2000) reported an
augmentation of transduction efficiencies of up to 20,000 fold with a single
mutation in NS5A, whereas Guo et al. (2001) reported increases in transduction
efficiencies of 5,000-10,000 fold with a deletion of a single amino acid in NS5A.
The amino acid substitutions that we describe here have not previously been
identified as adaptive mutants that enhance the efficiency of RNA transfection
and/or replication. One exception is the E1202G mutation in NS3 that we found in
both the R7 and R16 replicons. This adaptation was previously described by Guo
et al. (2001) and Krieger et al. (2001). All other adaptive mutations described
herein are, without exception, unpublished.

The development of selectable subgenomic HCV replicons has provided potential
avenues of exploration of HCV RNA replication, persistence, and pathogenesis in
cultured cells. However, the low transduction efficiency of the HCV RNA-containing
replicons as originally described (Lohmann et al., 1999) showed that it was not a
practical system for reverse genetics studies. The adaptive mutants described
herein overcome the low transduction efficiency. In light of the recent
descriptions of adaptive mutants by other groups, we note that adaptation can be
achieved by distinct mutations in different HCV NS proteins, although the level of
adaptation can vary drastically. The replicons encoding adaptive mutants that are
described herein are ideally suited for reverse genetic studies to identify novel
HCV targets or host cell targets that may modulate HCV RNA replication or HCV
replicon RNA colony formation. The adapted and highly efficient replicons are
suitable tools for characterizing subtle genotypic or phenotypic changes that
affect an easily quantifiable transduction efficiency.

Lastly, we have used our adapted HCV sub-genomic replicon cell line to
demonstrate the proficient inhibition of HCV RNA replication by a specific small
molecule inhibitor of the HCV NS3 protease. This is the first demonstration that an
antiviral, designed to specifically inhibit one of the HCV non-structural proteins,
inhibits HCV RNA replication in cell culture. Moreover, this compound and our S22.3
cell line validate the proposal that RNA replication is directed by the HCV
non-structural proteins NS3 to NS5B. The assay that we have described and validated
will be extremely useful in characterizing other inhibitors of HCV non-structural
protein function in cell culture in a high throughput fashion.

All references found throughout the present disclosure are herein incorporated by
reference whether they be found in the following list or not.
CLAIMS


1. A HCV polynucleotide molecule comprising a 5'-non translated region (NTR)
wherein guanine at position 1 is substituted for adenine. 

2. A HCV self-replicating polynucleotide comprising: 

- a 5'-NTR consisting of ACCAGC (SEQ ID NO. 8); 

- a HCV polyprotein region coding for a HCV polyprotein; and 

- a 3'-NTR region.

3. The HCV polynucleotide according to claim 2, wherein said polyprotein comprises 
one or more amino acid substitution selected from the group consisting of: 
R(1135)K; S(1148)G; S(1560)G; K(1691)R; L(1701)F; I(1984)V; T(1993)A;
G(2042)C; G(2042)R; S(2404)P; L(2155)P; P(2166)L and M(2992)T. 

4. The HCV polynucleotide encoding a polyprotein comprising one or more of the 
amino acid substitution as defined in claim 3, and further comprising the amino acid 
substitution E(1202)G. 

5. The HCV polynucleotide according to claim 3, wherein said substitution is a 
G2042C or a G2042R mutation. 

6. The HCV polynucleotide according to claim 3, wherein said substitution is selected 
from the group consisting of: K(1691)R; and G(2042)C. 

7. The HCV polynucleotide according to claim 3, wherein said substitution is selected 
from the group consisting of: R(1135)K; S(1560)G; K(1691)R; T(1993)A; G(2042)C; 
and P(2166)L.

8. The HCV polynucleotide according to claim 3, wherein said substitution is selected 
from the group consisting of: R(1135)K; S(1560)G; K(1691)R; T(1993)A; G(2042)C;
L(2155)P; and P(2166)L.

9. The HCV polynucleotide according to claim 3, wherein said substitution is selected 
from the group consisting of: E(1202)G; I(1984)V; G(2042)C; and M(2992)T.

10. The HCV polynucleotide according to claim 3, wherein said substitution is selected 
from the group consisting of: S(1148)G; E(1202)G; L(1701)F; G(2042)R; and
S(2404)P. 

11. The HCV polynucleotide according to claim 2, wherein said polynucleotide is a RNA 
molecule encoded by the DNA molecule selected from the group consisting of: SEQ 
ID NO. 2, 4, 5, 6, 7, 24 and 25. 

12. The HCV polynucleotide according to claim 2, wherein said polynucleotide is a DNA 
molecule selected from the group consisting of: SEQ ID NO. 2, 4, 5, 6, 7, 24 and 25.

13. An expression vector comprising a DNA form of the polynucleotide according to 
claim 2, operably linked to a promoter. 

14. A host cell transfected with the self-replicating polynucleotide molecule according to 
claim 2. 

15. A host cell according to claim 14, wherein the host cell is a eukaryotic cell line. 

16. A host cell according to claim 15, wherein said eukaryotic cell line is a hepatic cell 
line. 

17. A host cell according to claim 16, wherein said hepatic cell line is Huh-7. 

18. A RNA replication assay comprising the steps of: 

a) incubating the host cell according to claim 14 under conditions suitable for 
RNA replication; 

b) isolating the total cellular RNA from the cells; and 

c) analyzing the RNA so as to measure the amount of HCV RNA replicated. 

19. The assay according to claim 18, wherein the analysis of RNA levels in step c) is
carried out by amplifying the RNA by real-time RT-PCR analysis using HCV specific
primers so as to measure the amount of HCV RNA replicated.

20. The assay according to claim 18, wherein said polynucleotide encodes for a
reporter gene, and the analysis of RNA levels in step c) is carried out by assessing 
the level of reporter expressed. 

21. A method for testing a compound for inhibiting HCV replication, including the steps 
of: 

a) carrying out step a) according to claim 18, in the presence or absence of the
compound;

b) isolating the total cellular RNA from the cells; and 

c) analyzing the RNA so as to measure the amount of HCV RNA replicated. 

d) comparing the levels of HCV RNA in cells in the absence and presence of 
the inhibitor, 

wherein reduced RNA levels is indicative of the ability of the compound to inhibit 
replication. 

22. The method according to claim 21, wherein said cell line is incubated with the test
compound for about 3-4 days at a temperature of about 37°C. 

23. A HCV polynucleotide molecule comprising: 

- a 5'-NTR region;

- a HCV polyprotein region coding for a HCV polyprotein comprising one or 
more amino acid substitution selected from the group consisting of: 
R(1135)K; S(1148)G; S(1560)G; K(1691)R; L(1701)F; I(1984)V; T(1993)A;
G(2042)C; G(2042)R; S(2404)P; L(2155)P; P(2166)L and M(2992)T; and 

- a 3'-NTR region. 

24. The HCV self-replicating polynucleotide encoding a polyprotein comprising any
one of the amino acid substitutions as defined in claim 24, further comprising the
amino acid substitution E(1202)G.

25. The polynucleotide according to claim 24, wherein said substitution is a G2042C or 
a G2042R mutation. 

26. The HCV polynucleotide according to claim 24, wherein said substitution is selected 
from the group consisting of: K(1691)R; and G(2042)C.

27. The HCV polynucleotide according to claim 24, wherein said substitution is selected 
from the group consisting of: R(1135)K; S(1560)G; K(1691)R; T(1993)A; G(2042)C; 
and P(2166)L 

28. The HCV polynucleotide according to claim 24, wherein said substitution is selected 
from the group consisting of: R(1135)K; S(1560)G; K(1691)R; T(1993)A; G(2042)C; 
L(2155)P; and P(2166)L 

29. The HCV polynucleotide according to claim 24, wherein said substitution is selected 
from the group consisting of: E(1202)G; I(1984)V; G(2042)C; and M(2992)T.

30. The HCV polynucleotide according to claim 24, wherein said substitution is selected 
from the group consisting of: S(1148)G; E(1202)G; L(1701)F; G(2042)R; and 
S(2404)P. 

31. The HCV polynucleotide according to claim 24, wherein said molecule is a RNA 
molecule encoded by the DNA molecule selected from the group consisting of: SEQ 
ID NO. 2, 4, 5, 6, 7, 24 and 25.

32. The HCV polynucleotide according to claim 24, wherein said molecule is a DNA 
molecule selected from the group consisting of: SEQ ID NO. 2, 4, 5, 6, 7, 24 and 25. 

33. An expression vector comprising a DNA form of the polynucleotide according to 
claim 24, operably linked to a promoter. 

34. A host cell transfected with the self-replicating polynucleotide according to claim 24. 

35. A host cell according to claim 34, wherein the host cell is a eukaryotic cell line. 

36. A host cell according to claim 35, wherein said eukaryotic cell line is a hepatic cell 
line. 

37. A host cell according to claim 36, wherein said hepatic cell line is Huh-7. 
38. A RNA replication assay comprising the steps of:

a) incubating the host cell according to claim 34 under conditions suitable for
RNA replication;

b) isolating the total cellular RNA from the cells; and

c) analyzing the RNA so as to measure the amount of HCV RNA replicated.

39. The assay according to claim 38, wherein the analysis of RNA levels in step c) is 
carried out by amplifying the RNA by real-time RT-PCR analysis using HCV specific 
primers so as to measure the amount of HCV RNA replicated. 

40. The assay according to claim 38, wherein said polynucleotide encodes for a 
reporter gene, and the analysis of RNA levels in step c) is carried out by assessing 
the level of reporter expressed. 

41. A method for testing a compound for inhibiting HCV replication, including the steps 
of: 

a) carrying out step a) according to claim 38, in the presence or absence of the
compound; 

b) isolating the total cellular RNA from the cells; and 

c) analyzing the RNA so as to measure the amount of HCV RNA replicated. 

d) comparing the levels of HCV RNA in cells in the absence and presence of 
the inhibitor, 

wherein reduced RNA levels is indicative of the ability of the compound to inhibit 
replication. 

42. The method according to claim 41, wherein said cell line is incubated with the test
compound for about 3-4 days at a temperature of about 37°C. 
SEQUENCE LISTING

<110> BOEHRINGER INGELHEIM (CANADA) LTD. 

<120> SELF REPLICATING RNA MOLECULE FROM 
HEPATITIS C VIRUS 

<130> 13/083 

<150> 60/257,857 
<151> 2000-12-22 

<160> 25 


<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 8639 
<212> DNA 
<213> HCV 


<220> 
<221> CDS 

<222> (1803)...(8408)
<400> 1 

ggccagcccc cgattggggg cgacactcca ccatagatca ctcccctgtg aggaactact 60 
gtcttcacgc agaaagcgtc tagccatggc gttagtatga gtgtcgtgca gcctccagga 120 
ccccccctcc cgggagagcc atagtggtct gcggaaccgg tgagtacacc ggaattgcca 180 
ggacgaccgg gtcctttctt ggatcaaccc gctcaatgcc tggagatttg ggcgtgcccc 240 
cgcgagactg ctagccgagt agtgttgggt cgcgaaaggc cttgtggtac tgcctgatag 300
ggtgcttgcg agtgccccgg gaggtctcgt agaccgtgca ccatgagcac gaatcctaaa 360 
cctcaaagaa aaaccaaagg gcgcgccatg attgaacaag atggattgca cgcaggttct 420 
ccggccgctt gggtggagag gctattcggc tatgactggg cacaacagac aatcggctgc 480 
tctgatgccg ccgtgttccg gctgtcagcg caggggcgcc cggttctttt tgtcaagacc 540 
gacctgtccg gtgccctgaa tgaactgcag gacgaggcag cgcggctatc gtggctggcc 600 
acgacgggcg ttccttgcgc agctgtgctc gacgttgtca ctgaagcggg aagggactgg 660 
ctgctattgg gcgaagtgcc ggggcaggat ctcctgtcat ctcaccttgc tcctgccgag 720 
aaagtatcca tcatggctga tgcaatgcgg cggctgcata cgcttgatcc ggctacctgc 780 
ccattcgacc accaagcgaa acatcgcatc gagcgagcac gtactcggat ggaagccggt 840 
cttgtcgatc aggatgatct ggacgaagag catcaggggc tcgcgccagc cgaactgttc 900 
gccaggctca aggcgcgcat gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc 960 
tgcttgccga atatcatggt ggaaaatggc cgcttttctg gattcatcga ctgtggccgg 1020 
ctgggtgtgg cggaccgcta tcaggacata gcgttggcta cccgtgatat tgctgaagag 1080 
cttggcggcg aatgggctga ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg 1140 
cagcgcatcg ccttctatcg ccttcttgac gagttcttct gagttcgcgc ccagatgtta 1200 
acagaccaca acggtttccc tctagcggga tcaattccgc ccccccccct aacgttactg 1260 
gccgaagccg cttggaataa ggccggtgtg cgtttgtcta tatgttattt tccaccatat 1320 
tgccgtcttt tggcaatgtg agggcccgga aacctggccc tgtcttcttg acgagcattc 1380 
ctaggggtct ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc gtgaaggaag 1440 
cagttcctct ggaagcttct tgaagacaaa caacgtctgt agcgaccctt tgcaggcagc 1500 
ggaacccccc acctggcgac aggtgcctct gcggccaaaa gccacgtgta taagatacac 1560 
ctgcaaaggc ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg gaaagagtca 1620 
aatggctctc ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag gtaccccatt 1680 
gtatgggatc tgatctgggg cctcggtgca catgctttac atgtgtttag tcgaggttaa 1740 
aaaacgtcta ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata 1800 
cc atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta 1847
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val
1 5 10 15

ggt ctg ata ctc ttg acc ttg tca ccg cac tat aag ctg ttc ctc gct 1895
Gly Leu Ile Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala
20 25 30

agg ctc ata tgg tgg tta caa tat ttt atc acc agg gcc gag gca cac 1943
Arg Leu Ile Trp Trp Leu Gln Tyr Phe Ile Thr Arg Ala Glu Ala His
35 40 45

ttg caa gtg tgg atc ccc ccc ctc aac gtt cgg ggg ggc cgc gat gcc 1991
Leu Gln Val Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala
50 55 60

gtc atc ctc ctc acg tgc gcg atc cac cca gag cta atc ttt acc atc 2039
Val Ile Leu Leu Thr Cys Ala Ile His Pro Glu Leu Ile Phe Thr Ile
65 70 75

acc aaa atc ttg ctc gcc ata ctc ggt cca ctc atg gtg ctc cag gct 2087
Thr Lys Ile Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala
80 85 90 95

ggt ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg ctc att cgt 2135
Gly Ile Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu Ile Arg
100 105 110

gca tgc atg ctg gtg cgg aag gtt gct ggg ggt cat tat gtc caa atg 2183
Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met
115 120 125

gct ctc atg aag ttg gcc gca ctg aca ggt acg tac gtt tat gac cat 2231
Ala Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His
130 135 140

ctc acc cca ctg cgg gac tgg gcc cac gcg ggc cta cga gac ctt gcg 2279
Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala
145 150 155

gtg gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt atc 2327
Val Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val Ile
160 165 170 175

acc tgg ggg gca gac acc gcg gcg tgt ggg gac atc atc ttg ggc ctg 2375
Thr Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu
180 185 190

ccc gtc tcc gcc cgc agg ggg agg gag ata cat ctg gga ccg gca gac 2423
Pro Val Ser Ala Arg Arg Gly Arg Glu Ile His Leu Gly Pro Ala Asp
195 200 205

agc ctt gaa ggg cag ggg tgg cga ctc ctc gcg cct att acg gcc tac 2471
Ser Leu Glu Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr
210 215 220
tcc caa cag acg cga ggc cta ctt ggc tgc atc atc act agc ctc aca 2519
Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr
225 230 235

ggc cgg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tcc acc 2567
Gly Arg Asp Arg Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr
240 245 250 255

gca aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act 2615
Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr
260 265 270

gtc tat cat ggt gcc ggc tca aag acc ctt gcc ggc cca aag ggc cca 2663
Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro
275 280 285

atc acc caa atg tac acc aat gtg gac cag gac ctc gtc ggc tgg caa 2711
Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln
290 295 300

gcg ccc ccc ggg gcg cgt tcc ttg aca cca tgc acc tgc ggc agc tcg 2759
Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser
305 310 315

gac ctt tac ttg gtc acg agg cat gcc gat gtc att ccg gtg cgc cgg 2807
Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg
320 325 330 335

cgg ggc gac agc agg ggg agc cta ctc tcc ccc agg ccc gtc tcc tac 2855
Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr
340 345 350

ttg aag ggc tct tcg ggc ggt cca ctg ctc tgc ccc tcg ggg cac gct 2903
Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala
355 360 365

gtg ggc atc ttt cgg gct gcc gtg tgc acc cga ggg gtt gcg aag gcg 2951
Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala
370 375 380

gtg gac ttt gta ccc gtc gag tct atg gaa acc act atg cgg tcc ccg 2999
Val Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro
385 390 395

gtc ttc acg gac aac tcg tcc cct ccg gcc gta ccg cag aca ttc cag 3047
Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Thr Phe Gln
400 405 410 415

gtg gcc cat cta cac gcc cct act ggt agc ggc aag agc act aag gtg 3095
Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val
420 425 430

ccg gct gcg tat gca gcc caa ggg tat aag gtg ctt gtc ctg aac ccg 3143
Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
435 440 445
tcc gtc gcc gcc acc cta ggt ttc ggg gcg tat atg tct aag gca cat 3191
Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
450 455 460

ggt atc gac cct aac atc aga acc ggg gta agg acc atc acc acg ggt 3239
Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
465 470 475

gcc ccc atc acg tac tcc acc tat ggc aag ttt ctt gcc gac ggt ggt 3287
Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
480 485 490 495

tgc tct ggg ggc gcc tat gac atc ata ata tgt gat gag tgc cac tca 3335
Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
500 505 510

act gac tcg acc act atc ctg ggc atc ggc aca gtc ctg gac caa gcg 3383
Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
515 520 525

gag acg gct gga gcg cga ctc gtc gtg ctc gcc acc gct acg cct ccg 3431
Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
530 535 540

gga tcg gtc acc gtg cca cat cca aac atc gag gag gtg gct ctg tcc 3479
Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
545 550 555

agc act gga gaa atc ccc ttt tat ggc aaa gcc atc ccc atc gag acc 3527
Ser Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr
560 565 570 575

atc aag ggg ggg agg cac ctc att ttc tgc cat tcc aag aag aaa tgt 3575
Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
580 585 590

gat gag ctc gcc gcg aag ctg tcc ggc ctc gga ctc aat gct gta gca 3623
Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala
595 600 605

tat tac cgg ggc ctt gat gta tcc gtc ata cca act agc gga gac gtc 3671
Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
610 615 620

att gtc gta gca acg gac gct cta atg acg ggc ttt acc ggc gat ttc 3719
Ile Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe
625 630 635

gac tca gtg atc gac tgc aat aca tgt gtc acc cag aca gtc gac ttc 3767
Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
640 645 650 655

agc ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac 3815
Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp
660 665 670
gcg gtg tca cgc tcg cag cgg cga ggc agg act ggt agg ggc agg atg 3863
Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Met
675 680 685

ggc att tac agg ttt gtg act cca gga gaa cgg ccc tcg ggc atg ttc 3911
Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe
690 695 700

gat tcc tcg gtt ctg tgc gag tgc tat gac gcg ggc tgt gct tgg tac 3959
Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
705 710 715

gag ctc acg ccc gcc gag acc tca gtt agg ttg cgg gct tac cta aac 4007
Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn
720 725 730 735

aca cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag agc 4055
Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser
740 745 750

gtc ttt aca ggc ctc acc cac ata gac gcc cat ttc ttg tcc cag act 4103
Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
755 760 765

aag cag gca gga gac aac ttc ccc tac ctg gta gca tac cag gct acg 4151
Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr
770 775 780

gtg tgc gcc agg gct cag gct cca cct cca tcg tgg gac caa atg tgg 4199
Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
785 790 795

aag tgt ctc ata cgg cta aag cct acg ctg cac ggg cca acg ccc ctg 4247
Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
800 805 810 815

ctg tat agg ctg gga gcc gtt caa aac gag gtt act acc aca cac ccc 4295
Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Thr Thr His Pro
820 825 830

ata acc aaa tac atc atg gca tgc atg tcg gct gac ctg gag gtc gtc 4343
Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val
835 840 845

acg agc acc tgg gtg ctg gta ggc gga gtc cta gca gct ctg gcc gcg 4391
Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
850 855 860

tat tgc ctg aca aca ggc agc gtg gtc att gtg ggc agg atc atc ttg 4439
Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
865 870 875

tcc gga aag ccg gcc atc att ccc gac agg gaa gtc ctt tac cgg gag 4487
Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
880 885 890 895
ttc gat gag atg gaa gag tgc gcc tca cac ctc cct tac atc gaa cag 4535
Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln
900 905 910

gga atg cag ctc gcc gaa caa ttc aaa cag aag gca atc ggg ttg ctg 4583
Gly Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu
915 920 925

caa aca gcc acc aag caa gcg gag gct gct gct ccc gtg gtg gaa tcc 4631
Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser
930 935 940

aag tgg cgg acc ctc gaa gcc ttc tgg gcg aag cat atg tgg aat ttc 4679
Lys Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe
945 950 955

atc agc ggg ata caa tat tta gca ggc ttg tcc act ctg cct ggc aac 4727
Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
960 965 970 975

ccc gcg ata gca tca ctg atg gca ttc aca gcc tct atc acc agc ccg 4775
Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro
980 985 990

ctc acc acc caa cat acc ctc ctg ttt aac atc ctg ggg gga tgg gtg 4823
Leu Thr Thr Gln His Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
995 1000 1005

gcc gcc caa ctt gct cct ccc agc gct gct tct gct ttc gta ggc gcc 4871
Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala
1010 1015 1020

ggc atc gct gga gcg gct gtt ggc agc ata ggc ctt ggg aag gtg ctt 4919
Gly Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu
1025 1030 1035

gtg gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg ctc gtg 4967
Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
1040 1045 1050 1055

gcc ttt aag gtc atg agc ggc gag atg ccc tcc acc gag gac ctg gtt 5015
Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val
1060 1065 1070

aac cta ctc cct gct atc ctc tcc cct ggc gcc cta gtc gtc ggg gtc 5063
Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
1075 1080 1085

gtg tgc gca gcg ata ctg cgt cgg cac gtg ggc cca ggg gag ggg gct 5111
Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
1090 1095 1100

gtg cag tgg atg aac cgg ctg ata gcg ttc gct tcg cgg ggt aac cac 5159
Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
1105 1110 1115
gtc tcc ccc acg cac tat gtg cct gag agc gac gct gca gca cgt gtc 5207
Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
1120 1125 1130 1135

act cag atc ctc tct agt ctt acc atc act cag ctg ctg aag agg ctt 5255
Thr Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu
1140 1145 1150

cac cag tgg atc aac gag gac tgc tcc acg cca tgc tcc ggc tcg tgg 5303
His Gln Trp Ile Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp
1155 1160 1165

cta aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag 5351
Leu Arg Asp Val Trp Asp Trp Ile Cys Thr Val Leu Thr Asp Phe Lys
1170 1175 1180

acc tgg ctc cag tcc aag ctc ctg ccg cga ttg ccg gga gtc ccc ttc 5399
Thr Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe
1185 1190 1195

ttc tca tgt caa cgt ggg tac aag gga gtc tgg cgg ggc gac ggc atc 5447
Phe Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
1200 1205 1210 1215

atg caa acc acc tgc cca tgt gga gca cag atc acc gga cat gtg aaa 5495
Met Gln Thr Thr Cys Pro Cys Gly Ala Gln Ile Thr Gly His Val Lys
1220 1225 1230

aac ggt tcc atg agg atc gtg ggg cct agg acc tgt agt aac acg tgg 5543
Asn Gly Ser Met Arg Ile Val Gly Pro Arg Thr Cys Ser Asn Thr Trp
1235 1240 1245

cat gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc 5591
His Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
1250 1255 1260

tcc ccg gcg cca aat tat tct agg gcg ctg tgg cgg gtg gct gct gag 5639
Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu
1265 1270 1275

gag tac gtg gag gtt acg cgg gtg ggg gat ttc cac tac gtg acg ggc 5687
Glu Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly
1280 1285 1290 1295

atg acc act gac aac gta aag tgc ccg tgt cag gtt ccg gcc ccc gaa 5735
Met Thr Thr Asp Asn Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu
1300 1305 1310

ttc ttc aca gaa gtg gat ggg gtg cgg ttg cac agg tac gct cca gcg 5783
Phe Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala
1315 1320 1325

tgc aaa ccc ctc cta cgg gag gag gtc aca ttc ctg gtc ggg ctc aat 5831
Cys Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn
1330 1335 1340
caa tac ctg gtt ggg tca cag ctc cca tgc gag ccc gaa ccg gac gta 5879
Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
1345 1350 1355

gca gtg ctc act tcc atg ctc acc gac ccc tcc cac att acg gcg gag 5927
Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
1360 1365 1370 1375

acg gct aag cgt agg ctg gcc agg gga tct ccc ccc tcc ttg gcc agc 5975
Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser
1380 1385 1390

tca tca gct agc cag ctg tct gcg cct tcc ttg aag gca aca tgc act 6023
Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
1395 1400 1405

acc cgt cat gac tcc ccg gac gct gac ctc atc gag gcc aac ctc ctg 6071
Thr Arg His Asp Ser Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu
1410 1415 1420

tgg cgg cag gag atg ggc ggg aac atc acc cgc gtg gag tca gaa aat 6119
Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
1425 1430 1435

aag gta gta att ttg gac tct ttc gag ccg ctc caa gcg gag gag gat 6167
Lys Val Val Ile Leu Asp Ser Phe Glu Pro Leu Gln Ala Glu Glu Asp
1440 1445 1450 1455

gag agg gaa gta tcc gtt ccg gcg gag atc ctg cgg agg tcc agg aaa 6215
Glu Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys
1460 1465 1470

ttc cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca 6263
Phe Pro Arg Ala Met Pro Ile Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1475 1480 1485

ctg tta gag tcc tgg aag gac ccg gac tac gtc cct cca gtg gta cac 6311
Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His
1490 1495 1500

ggg tgt cca ttg ccg cct gcc aag gcc cct ccg ata cca cct cca cgg 6359
Gly Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro Ile Pro Pro Pro Arg
1505 1510 1515

agg aag agg acg gtt gtc ctg tca gaa tct acc gtg tct tct gcc ttg 6407
Arg Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu
1520 1525 1530 1535

gcg gag ctc gcc aca aag acc ttc ggc agc tcc gaa tcg tcg gcc gtc 6455
Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val
1540 1545 1550

gac agc ggc acg gca acg gcc tct cct gac cag ccc tcc gac gac ggc 6503
Asp Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly
1555 1560 1565
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gac gcg gga tec gac gtt gag teg tac tec tec atg ccc ccc ctt gag 6551 
Asp Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu 
1570 1575 1580 

ggg gag ccg ggg gat ccc gat etc age gac ggg tct tgg tct acc gta 6599 
Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val 
1585 1590 1595 

age gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca 6647 
Ser Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr 
1600 1605 1610 1615 

tgg aca ggc gec ctg ate acg cca tgc get gcg gag gaa acc aag ctg 6695 
Trp Thr Gly Ala Leu He Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu 

1620 1625 1630 

ccc ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc 6743 
Pro He Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val 

1635 1640 1645 

tat get aca aca tct cgc age gca age ctg egg cag aag aag gtc acc 6791 
Tyr Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr 
1650 1655 1660 

ttt gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag 6839 
Phe Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys 
1665 1670 1675 

gag atg aag gcg aag gcg tcc aca gtt aag gct aaa ctt cta tcc gtg 6887
Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val
1680 1685 1690 1695 

gag gaa gee tgt aag ctg acg ccc cca cat teg gee aga tct aaa ttt 6935 
Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe 

1700 1705 1710 



ggc tat ggg gca aag gac gtc egg aac eta tec age aag gee gtt aac 6983 
Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn 

1715 1720 1725 

cac atc cgc tcc gtg tgg aag gac ttg ctg gaa gac act gag aca cca 7031
His Ile Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro
1730 1735 1740 

att gac acc acc ate atg gca aaa aat gag gtt ttc tgc gtc caa cca 7079 
He Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro 
1745 1750 1755 



gag aag ggg ggc cgc aag cca gct cgc ctt atc gta ttc cca gat ttg 7127
Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu
1760 1765 1770 1775

ggg gtt cgt gtg tgc gag aaa atg gcc ctt tac gat gtg gtc tcc acc 7175
Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr
1780 1785 1790
ctc cct cag gcc gtg atg ggc tct tca tac gga ttc caa tac tct cct 7223
Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro

1795 1800 1805 

gga cag egg gtc gag ttc ctg gtg aat gee tgg aaa gcg aag aaa tgc 7271 
Gly Gin Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys 
1810 1815 1820 

cct atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act 7319 
Pro Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr 
1825 1830 1835 

gag aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg 7367 
Glu Asn Asp lie Arg Val Glu Glu Ser lie Tyr Gin Cys Cys Asp Leu 
1840 1845 1850 1855 

gec ccc gaa gec aga cag gee ata agg teg etc aca gag egg ctt tac 7415 
Ala Pro Glu Ala Arg Gin Ala lie Arg Ser Leu Thr Glu Arg Leu Tyr 

1860 1865 1870 

atc ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc 7463
Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg

1875 1880 1885 

cgg tgc cgc gcg agc ggt gta ctg acg acc agc tgc ggt aat acc ctc 7511
Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu 
1890 1895 1900 

aca tgt tac ttg aag gcc gct gcg gcc tgt cga gct gcg aag ctc cag 7559
Thr Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gln
1905 1910 1915 

gac tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa 7607 
Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val lie Cys Glu 
1920 1925 1930 1935 

age gcg ggg acc caa gag gac gag gcg age eta egg gec ttc acg gag 7655 
Ser Ala Gly Thr Gin Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu 

1940 1945 1950 

get atg act aga tac tct gee ccc cct ggg gac ccg ccc aaa cca gaa 7703 
Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu 

1955 1960 1965 

tac gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg 7751 
Tyr Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val Ser Val Ala 
1970 1975 1980 

cac gat gca tct ggc aaa agg gtg tac tat etc acc cgt gac ccc acc 7799 
His Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr 
1985 1990 1995 

acc ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc 7847 
Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val 
2000 2005 2010 2015 
aat tcc tgg cta ggc aac atc atc atg tat gcg ccc acc ttg tgg gca 7895
Asn Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala

2020 2025 2030

agg atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa 7943 
Arg Met He Leu Met Thr His Phe Phe Ser He Leu Leu Ala Gin Glu 

2035 2040 2045 

caa ctt gaa aaa gee eta gat tgt cag ate tac ggg gee tgt tac tec 7991 
Gin Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser 
2050 2055 2060 

att gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt 8039 
He Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu 
2065 2070 2075 

agc gca ttt tca ctc cat agt tac tct cca ggt gag atc aat agg gtg 8087
Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val
2080 2085 2090 2095 

get tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga 8135 
Ala Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg 

2100 2105 2110 

cat egg gee aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg 8183 
His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg 

2115 2120 2125 

get gec act tgt ggc aag tac etc ttc aac tgg gca gta agg acc aag 8231 
Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys 
2130 2135 2140 

etc aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age 8279 
Leu Lys Leu Thr Pro He Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser 
2145 2150 2155 

tgg ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct 8327 
Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser 
2160 2165 2170 2175 

cgt gee cga ccc cgc tgg ttc atg tgg tgc eta etc eta ctt tct gta 8375 
Arg Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val 

2180 2185 2190 

ggg gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8428
Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg *
2195 2200

ggccaatagg ccatcctgtt tttttccctt tttttttttc tttttttttt tttttttttt 8488
tttttttttt ttttctcctt tttttttcct ctttttttcc ttttctttcc tttggtggct 8548
ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcttg actgcagaga 8608
gtgctgatac tggcctctct gcagatcaag t 8639

<210> 2 
<211> 8642 
<212> DNA 
<213> HCV 



<220> 
<221> CDS

<222> (1802) ... (8407) 

<221> variation 
<222> 6268 
<223> r = a or g 
<221> variation 
<222> 4446 
<223> r = a or g 

<400> 2 

accagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200 
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260 
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380 
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440 
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560 
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680 
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800 
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

ctg ata etc ttg acc ttg tea ccg cac tat aag ctg ttc etc get agg 1897 
Leu lie Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg 

20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gee gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gee gtc 1993 
Gin Val Trp lie Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 
atc ctc ctc acg tgc gcg atc cac cca gag cta atc ttt acc atc acc 2041
Ile Leu Leu Thr Cys Ala Ile His Pro Glu Leu Ile Phe Thr Ile Thr
65 70 75 80

aaa ate ttg etc gee ata etc ggt cca etc atg gtg etc cag get ggt 2089 
Lys lie Leu Leu Ala lie Leu Gly Pro Leu Met Val Leu Gin Ala Gly 

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
He Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu He Arg Ala 

100 105 110

tgc atg ctg gtg egg aag gtt get ggg ggt cat tat gtc caa atg get 2185 
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gin Met Ala 
115 120 125 

etc atg aag ttg gee gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

acc cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate acc 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val Ile Thr

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp He He Leu Gly Leu Pro 

180 185 190 

gtc tec gec cgc agg ggg agg gag ata cat ctg gga ccg gca gac age 2425 
Val Ser Ala Arg Arg Gly Arg Glu He His Leu Gly Pro Ala Asp Ser 
195 200 205 

ctt gaa ggg cag ggg tgg cga etc etc gcg cct att acg gee tac tec 2473 
Leu Glu Gly Gin Gly Trp Arg Leu Leu Ala Pro He Thr Ala Tyr Ser 
210 215 220 

caa cag acg cga ggc eta ctt ggc tgc ate ate act age etc aca ggc 2521 
Gin Gin Thr Arg Gly Leu Leu Gly Cys He He Thr Ser Leu Thr Gly 
225 230 235 240 

egg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tec acc gca 2569 
Arg Asp Arg Asn Gin Val Glu Gly Glu Val Gin Val Val Ser Thr Ala 

245 250 255 

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617 
Thr Gin Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 

260 265 270 

tat cat ggt gee ggc tea aag acc ctt gee ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro He 
275 280 285 
acc caa atg tac acc aat gtg gac cag gac ctc gtc ggc tgg caa gcg 2713
Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
290 295 300

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg agg cat gee gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Arg His Ala Asp Val lie Pro Val Arg Arg Arg 

325 330 335 

ggc gac age agg ggg age eta etc tec ccc agg ccc gtc tec tac ttg 2857 
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 

340 345 350 

aag ggc tct tcg ggc ggt cca ctg ctc tgc ccc tcg ggg cac gct gtg 2905
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
355 360 365 

ggc ate ttt egg get gee gtg tgc acc cga ggg gtt gcg aag gcg gtg 2953 
Gly lie Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val 
370 375 380 

gac ttt gta ccc gtc gag tct atg gaa acc act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
385 390 395 400 

ttc acg gac aac teg tec cct ccg gee gta ccg cag aca ttc cag gtg 3049 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gin Thr Phe Gin Val 

405 410 415 

gec cat eta cac gee cct act ggt age ggc aag age act aag gtg ccg 3097 
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

420 425 430 

gct gcg tat gca gcc caa ggg tat aag gtg ctt gtc ctg aac ccg tcc 3145
Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
435 440 445 

gtc gee gee acc eta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
450 455 460 

ate gac cct aac ate aga acc ggg gta agg acc ate acc acg ggt gec 3241 
He Asp Pro Asn lie Arg Thr Gly Val Arg Thr He Thr Thr Gly Ala 
465 470 475 480 

ccc ate acg tac tec acc tat ggc aag ttt ctt gec gac ggt ggt tgc 3289 
Pro He Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gee tat gac ate ata ata tgt gat gag tgc cac tea act 3337 
Ser Gly Gly Ala Tyr Asp He He He Cys Asp Glu Cys His Ser Thr 

500 505 510 
gac tcg acc act atc ctg ggc atc ggc aca gtc ctg gac caa gcg gag 3385
Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
515 520 525 

acg get gga gcg cga etc gtc gtg etc gee acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540 

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn He Glu Glu Val Ala Leu Ser Ser 
545 550 555 560 

act gga gaa ate ccc ttt tat ggc aaa gee ate ccc ate gag acc ate 3529 
Thr Gly Glu He Pro Phe Tyr Gly Lys Ala He Pro He Glu Thr He 

565 570 575 

aag ggg ggg agg cac ctc att ttc tgc cat tcc aag aag aaa tgt gat 3577
Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp

580 585 590 

gag etc gee gcg aag ctg tec ggc etc gga etc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val He 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 

tea gtg ate gac tgc aat aca tgt gtc acc cag aca gtc gac ttc age 3769 
Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser 

645 650 655 

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817 
Leu Asp Pro Thr Phe Thr He Glu Thr Thr Thr Val Pro Gin Asp Ala 

660 665 670 

gtg tea cgc teg cag egg cga ggc agg act ggt agg ggc agg atg ggc 3865 
Val Ser Arg Ser Gin Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly 
675 680 685 

att tac agg ttt gtg act cca gga gaa egg ccc teg ggc atg ttc gat 3913 
He Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp 
690 695 700 

tec teg gtt ctg tgc gag tgc tat gac gcg ggc tgt get tgg tac gag 3961 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
705 710 715 720 

etc acg ccc gec gag acc tea gtt agg ttg egg get tac eta aac aca 4009 
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 

725 730 735 
cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag agc gtc 4057
Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val

740 745 750 

ttt aca ggc etc acc cac ata gac gec cat ttc ttg tec cag act aag 4105 
Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser Gin Thr Lys 
755 760 765

cag gca gga gac aac ttc ccc tac ctg gta gca tac cag get acg gtg 4153 
Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val 
770 775 780

tgc gee agg get cag get cca cct cca teg tgg gac caa atg tgg aag 4201 
Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp Lys 
785 790 795 800 

tgt etc ata egg eta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249 
Cys Leu He Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu 

805 810 815 

tat agg ctg gga gec gtt caa aac gag gtt act acc aca cac ccc ata 4297 
Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro He 

820 825 830 

acc aaa tac ate atg gca tgc atg teg get gac ctg gag gtc gtc acg 4345 
Thr Lys Tyr He Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr 
835 840 845 

age acc tgg gtg ctg gta ggc gga gtc eta gca get ctg gee gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc age gtg gtc att gtg ggc agg ate ate ttg tec 4441 
Cys Leu Thr Thr Gly Ser Val Val He Val Gly Arg He He Leu Ser 
865 870 875 880 

gga arg ccg gcc atc att ccc gac agg gaa gtc ctt tac cgg gag ttc 4489
Gly Xaa Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe

885 890 895 

gat gag atg gaa gag tgc gee tea cac etc cct tac ate gaa cag gga 4537 
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr He Glu Gin Gly 

900 905 910 

atg cag etc gee gaa caa ttc aaa cag aag gca ate ggg ttg ctg caa 4585 
Met Gin Leu Ala Glu Gin Phe Lys Gin Lys Ala He Gly Leu Leu Gin 
915 920 925 

aca gee acc aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys
930 935 940 

tgg egg acc etc gaa gee ttc tgg gcg aag cat atg tgg aat ttc ate 4681 
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe He 
945 950 955 960 




agc ggg ata caa tat tta gca ggc ttg tcc act ctg cct ggc aac ccc 4729
Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro
965 970 975

gcg ata gca tca ctg atg gca ttc aca gcc tct atc acc agc ccg ctc 4777
Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu
980 985 990

acc acc caa cat acc ctc ctg ttt aac atc ctg ggg gga tgg gtg gcc 4825
Thr Thr Gln His Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala
995 1000 1005

gcc caa ctt gct cct ccc agc gct gct tct gct ttc gta ggc gcc ggc 4873
Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly
1010 1015 1020

atc gct gga gcg gct gtt ggc agc ata ggc ctt ggg aag gtg ctt gtg 4921
Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val
1025 1030 1035 1040

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg ctc gtg gcc 4969
Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala
1045 1050 1055

ttt aag gtc atg agc ggc gag atg ccc tcc acc gag gac ctg gtt aac 5017
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn
1060 1065 1070

cta ctc cct gct atc ctc tcc cct ggc gcc cta gtc gtc ggg gtc gtg 5065
Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val
1075 1080 1085

tgc gca gcg ata ctg cgt cgg cac gtg ggc cca ggg gag ggg gct gtg 5113
Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val
1090 1095 1100

cag tgg atg aac cgg ctg ata gcg ttc gct tcg cgg ggt aac cac gtc 5161
Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val
1105 1110 1115 1120

tcc ccc acg cac tat gtg cct gag agc gac gct gca gca cgt gtc act 5209
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr
1125 1130 1135

cag atc ctc tct agt ctt acc atc act cag ctg ctg aag agg ctt cac 5257
Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His
1140 1145 1150

cag tgg atc aac gag gac tgc tcc acg cca tgc tcc ggc tcg tgg cta 5305
Gln Trp Ile Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu
1155 1160 1165

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag acc 5353
Arg Asp Val Trp Asp Trp Ile Cys Thr Val Leu Thr Asp Phe Lys Thr
1170 1175 1180

tgg ctc cag tcc aag ctc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401
Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe
1185 1190 1195 1200

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly lie Met 

1205 1210 1215 

caa acc ace tgc cca tgt gga gca cag ate acc gga cat gtg aaa aac 5497 
Gin Thr Thr Cys Pro Cys Gly Ala Gin He Thr Gly His Val Lys Asn 

1220 1225 1230 

tgt tec atg agg ate gtg ggg cct agg acc tgt agt aac acg tgg cat 5545 
Cys Ser Met Arg He Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His 
1235 1240 1245 

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tec 5593 
Gly Thr Phe Pro He Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser 
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg egg gtg get get gag gag 5641 
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu 
1265 1270 1275 1280 

tac gtg gag gtt acg egg gtg ggg gat ttc cac tac gtg acg ggc atg 5689 
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met 

1285 1290 1295 

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gec ccc gaa ttc 5737 
Thr Thr Asp Asn Val Lys Cys Pro Cys Gin Val Pro Ala Pro Glu Phe 

1300 1305 1310 

ttc aca gaa gtg gat ggg gtg egg ttg cac agg tac get cca gcg tgc 5785 
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys 
1315 1320 1325 

aaa ccc etc eta egg gag gag gtc aca ttc ctg gtc ggg etc aat caa 5833 
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gin 
1330 1335 1340 

tac ctg gtt ggg tea cag etc cca tgc gag ccc gaa ccg gac gta gca 5881 
Tyr Leu Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp Val Ala 
1345 1350 1355 1360 


gtg etc act tec atg etc acc gac ccc tec cac att acg gcg gag acg 5929 
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His He Thr Ala Glu Thr 

1365 1370 1375 

get aag cgt agg ctg gee agg gga tct ccc ccc tec ttg gee age tea 5977 
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg cct tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 
cgt cat gac tcc ccg gac gct gac ctc atc gag gcc aac ctc ctg tgg 6073
Arg His Asp Ser Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu Trp
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gaa aat aag 6121 
Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 



gta gta att ttg gac tct ttc gag ccg ctc caa gcg gag gag gat gag 6169
Val Val Ile Leu Asp Ser Phe Glu Pro Leu Gln Ala Glu Glu Asp Glu
1445 1450 1455

agg gaa gta tcc gtt ccg gcg gag atc ctg cgg agg tcc agg aaa ttc 6217
Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys Phe
1460 1465 1470

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265
Pro Arg Ala Met Pro Ile Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
1475 1480 1485

ttr gag tcc tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313
Xaa Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly
1490 1495 1500

tgt cca ttg ccg cct gcc aag gcc cct ccg ata cca cct cca cgg agg 6361
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
1505 1510 1515 1520

aag agg acg gtt gtc ctg tca gaa tct acc gtg tct tct gcc ttg gcg 6409
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala
1525 1530 1535

gag ctc gcc aca aag acc ttc ggc agc tcc gaa tcg tcg gcc gtc gac 6457
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp
1540 1545 1550

agc ggc acg gca acg gcc tct cct gac cag ccc tcc gac gac ggc gac 6505
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly Asp
1555 1560 1565

gcg gga tcc gac gtt gag tcg tac tcc tcc atg ccc ccc ctt gag ggg 6553
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
1570 1575 1580

gag ccg ggg gat ccc gat ctc agc gac ggg tct tgg tct acc gta agc 6601
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
1585 1590 1595 1600

gag gag gct agt gag gac gtc gtc tgc tgc tcg atg tcc tac aca tgg 6649
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
1605 1610 1615

aca ggc gcc ctg atc acg cca tgc gct gcg gag gaa acc aag ctg ccc 6697
Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro
1620 1625 1630

atc aat gca ctg agc aac tct ttg ctc cgt cac cac aac ttg gtc tat 6745
Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr
1635 1640 1645
gct aca aca tct cgc agc gca agc ctg cgg cag aag aag gtc acc ttt 6793
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gln Lys Lys Val Thr Phe
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tec aca gtt aag get aaa ctt eta tec gtg gag 6889 
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu 

1685 1690 1695 

gaa gee tgt aag ctg acg ccc cca cat teg gee aga tct aaa ttt ggc 6937 
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc egg aac eta tec age aag gee gtt aac cac 6985 
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

ate cgc tec gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
lie Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro He 
1730 1735 1740 

gac ace ace ate atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081 
Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu 
1745 1750 1755 1760 

aag ggg ggc cgc aag cca get cgc ctt ate gta ttc cca gat ttg ggg 7129 
Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp Leu Gly 

1765 1770 1775

gtt cgt gtg tgc gag aaa atg gec ctt tac gat gtg gtc tec acc etc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 

1780 1785 1790 

cct cag gee gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805 

cag egg gtc gag ttc ctg gtg aat gec tgg aaa gcg aag aaa tgc cct 7273 
Gin Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro 
1810 1815 1820 


atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gee 7369 
Asn Asp He Arg Val Glu Glu Ser He Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gee aga cag gee ata agg teg etc aca gag egg ctt tac ate 7417 
Pro Glu Ala Arg Gin Ala He Arg Ser Leu Thr Glu Arg Leu Tyr He 

1860 1865 1870
ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc cgg 7465
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
1875 1880 1885 

tgc cgc gcg age ggt gta ctg acg acc age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900 

tgt tac ttg aag gec get gcg gec tgt cga get gcg aag etc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gin Asp 
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val lie Cys Glu Ser 

1925 1930 1935 

gcg ggg acc caa gag gac gag gcg agc cta cgg gcc ttc acg gag gct 7657
Ala Gly Thr Gln Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala

1940 1945 1950 

atg act aga tac tct gec ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980

gat gca tct ggc aaa agg gtg tac tat etc acc cgt gac ccc acc acc 7801 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
1985 1990 1995 2000 

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015

tec tgg eta ggc aac ate ate atg tat gcg ccc acc ttg tgg gca agg 7897 
Ser Trp Leu Gly Asn lie lie Met Tyr Ala Pro Thr Leu Trp Ala Arg 

2020 2025 2030 

atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa caa 7945 
Met lie Leu Met Thr His Phe Phe Ser He Leu Leu Ala Gin Glu Gin 
2035 2040 2045 

ctt gaa aaa gec eta gat tgt cag ate tac ggg gec tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser He 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg Val Ala 

2085 2090 2095 




tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gee aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gee act tgt ggc aag tac etc ttc aac tgg gca gta agg ace aag etc 8233 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 
2130 2135 2140 

aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age tgg 8281 
Lys Leu Thr Pro lie Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser Trp 
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gee cga ccc cgc tgg ttc atg tgg tgc eta etc eta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

2180 2185 2190 

gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427
Val Gly Ile Tyr Leu Leu Pro Asn Arg *
2195 2200

ggccaatagg ccatcctgtt tttttcccct tttttttttt tttttttttc tttttttttt 8487
tttttttttt tttttttttc tccttttttt tcctcttttt ttccttttct ttcctttggt 8547
ggctccatct tagccctagt cacggctagc tgtgaaaggt ccgtgagccg cttgactgca 8607
gagagtgctg atactggcct ctctgcagat caagt 8642
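
The two `variation` features of SEQ ID NO: 2 (positions 4446 and 6268, `r = a or g`) fall in the codons `arg` and `ttr` of the listing above. The following is a minimal sketch, not part of the original listing, of how such an ambiguous codon can be expanded and translated. The helper names `expand_codon` and `residues_for` are hypothetical, and the codon table is deliberately limited to the four codons needed here.

```python
from itertools import product

# IUPAC nucleotide codes used in this illustration (r = a or g, y = c or t).
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "r": "ag", "y": "ct"}

# Partial standard genetic code: only the codons needed for this sketch.
CODON_TABLE = {"aag": "Lys", "agg": "Arg", "tta": "Leu", "ttg": "Leu"}


def expand_codon(codon):
    """Return every unambiguous codon encoded by an IUPAC-coded codon."""
    choices = (IUPAC[base] for base in codon.lower())
    return ["".join(bases) for bases in product(*choices)]


def residues_for(codon):
    """Return the sorted set of residues an ambiguous codon can encode."""
    return sorted({CODON_TABLE[c] for c in expand_codon(codon)})


print(residues_for("arg"))  # codon at nucleotides 4445-4447 (residue 882)
print(residues_for("ttr"))  # codon at nucleotides 6266-6268 (residue 1489)
```

Under these assumptions the first codon can encode Lys or Arg, while the second encodes Leu for either base, which agrees with the `VARIANT` entries given for residues 882 and 1489 of SEQ ID NO: 3.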

<210> 3 

<211> 2201 

<212> PRT 

<213> HCV 

<220> 

<221> VARIANT 
<222> 882 

<223> Xaa is Lys or Arg 

<221> VARIANT 
<222> 1489 
<223> Xaa is Leu 

<400> 3 

Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

Leu Ile Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg
20 25 30

Leu Ile Trp Trp Leu Gln Tyr Phe Ile Thr Arg Ala Glu Ala His Leu
35 40 45

Gln Val Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val
50 55 60

Ile Leu Leu Thr Cys Ala Ile His Pro Glu Leu Ile Phe Thr Ile Thr
65 70 75 80

Lys Ile Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly
85 90 95

Ile Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu Ile Arg Ala
100 105 110

Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met Ala
115 120 125

Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu
130 135 140

Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val
145 150 155 160

Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val Ile Thr
165 170 175

Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro
180 185 190

Val Ser Ala Arg Arg Gly Arg Glu Ile His Leu Gly Pro Ala Asp Ser
195 200 205

Leu Glu Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser
210 215 220

Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly
225 230 235 240

Arg Asp Arg Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
245 250 255

Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
260 265 270

Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile
275 280 285

Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
290 295 300

Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp
305 310 315 320

Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg
325 330 335

Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu
340 345 350

Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
355 360 365

Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
370 375 380

Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val
385 390 395 400

Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Thr Phe Gln Val
405 410 415

Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
420 425 430

Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
435 440 445

Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly
450 455 460

Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
465 470 475 480

Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys
485 490 495

Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr
500 505 510

Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
515 520 525

Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly
530 535 540

Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Ser
545 550 555 560

Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile
565 570 575

Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
580 585 590

Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr
595 600 605

Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Ile
610 615 620

Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp
625 630 635 640

Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser
645 650 655

Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala
660 665 670

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly
675 680 685

Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
690 695 700

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu
705 710 715 720

Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr
725 730 735

Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val
740 745 750

Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
755 760 765

Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
770 775 780

Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
785 790 795 800

Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
805 810 815

Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Thr Thr His Pro Ile
820 825 830

Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr
835 840 845

Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr
850 855 860

Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser
865 870 875 880

Gly Xaa Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe
885 890 895

Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly
900 905 910

Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu Gln
915 920 925

Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys
930 935 940

Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe Ile
945 950 955 960

Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro
965 970 975

Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu
980 985 990

Thr Thr Gln His Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala
995 1000 1005

Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly
1010 1015 1020

Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val
1025 1030 1035 1040

Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala
1045 1050 1055

Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn
1060 1065 1070

Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val
1075 1080 1085

Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val
1090 1095 1100

Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val
1105 1110 1115 1120

Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr
1125 1130 1135

Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His
1140 1145 1150

Gln Trp Ile Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu
1155 1160 1165

Arg Asp Val Trp Asp Trp Ile Cys Thr Val Leu Thr Asp Phe Lys Thr
1170 1175 1180

Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe
1185 1190 1195 1200

Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile Met
1205 1210 1215

Gln Thr Thr Cys Pro Cys Gly Ala Gln Ile Thr Gly His Val Lys Asn
1220 1225 1230

Cys Ser Met Arg Ile Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His
1235 1240 1245

Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser
1250 1255 1260

Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu
1265 1270 1275 1280

Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met
1285 1290 1295

Thr Thr Asp Asn Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe
1300 1305 1310

Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys
1315 1320 1325

Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gln
1330 1335 1340

Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala
1345 1350 1355 1360

Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr
1365 1370 1375

Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser
1380 1385 1390

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr
1395 1400 1405

Arg His Asp Ser Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu Trp
1410 1415 1420

Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys
1425 1430 1435 1440

Val Val Ile Leu Asp Ser Phe Glu Pro Leu Gln Ala Glu Glu Asp Glu
1445 1450 1455

Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys Phe
1460 1465 1470

Pro Arg Ala Met Pro Ile Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
1475 1480 1485

Xaa Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly
1490 1495 1500

Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
1505 1510 1515 1520

Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala
1525 1530 1535

Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp
1540 1545 1550

Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly Asp
1555 1560 1565

Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
1570 1575 1580

Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
1585 1590 1595 1600

Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
1605 1610 1615

Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro
1620 1625 1630

Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr
1635 1640 1645

Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gln Lys Lys Val Thr Phe
1650 1655 1660

Asp Arg Leu Gln Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu
1665 1670 1675 1680

Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu
1685 1690 1695

Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly
1700 1705 1710

Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His
1715 1720 1725

Ile Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro Ile
1730 1735 1740

Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu
1745 1750 1755 1760

Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly
1765 1770 1775

Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu
1780 1785 1790

Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly
1795 1800 1805

Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro
1810 1815 1820

Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1825 1830 1835 1840

Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala
1845 1850 1855

Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile
1860 1865 1870

Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
1875 1880 1885

Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr
1890 1895 1900

Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
1905 1910 1915 1920

Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser
1925 1930 1935




Ala Gly Thr Gln Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala
1940 1945 1950

Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr
1955 1960 1965

Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His
1970 1975 1980

Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr
1985 1990 1995 2000

Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn
2005 2010 2015

Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg
2020 2025 2030

Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln
2035 2040 2045

Leu Glu Lys Ala Leu Asp Cys Gln Ile Tyr Gly Ala Cys Tyr Ser Ile
2050 2055 2060

Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser
2065 2070 2075 2080

Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala
2085 2090 2095

Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His
2100 2105 2110

Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gln Gly Gly Arg Ala
2115 2120 2125

Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu
2130 2135 2140

Lys Leu Thr Pro Ile Pro Ala Ala Ser Gln Leu Asp Leu Ser Ser Trp
2145 2150 2155 2160

Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg
2165 2170 2175

Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly
2180 2185 2190

Val Gly Ile Tyr Leu Leu Pro Asn Arg
2195 2200



<210> 4 
<211> 8643 
<212> DNA 
<213> HCV 



<220> 
<221> CDS 

<222> (1802)...(8407)
<400> 4 

accagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420
cggccgcttg ggtggagagg ctattcggct atgactgggc gcaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200 
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260 
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320 
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380 
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440 
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560 
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaga 1740 
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800 
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

ctg ata ctc ttg acc ttg tca ccg cac tat aag ctg ttc ctc gct agg 1897
Leu Ile Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg
20 25 30

ctc ata tgg tgg tta caa tat ttt atc acc agg gcc gag gca cac ttg 1945
Leu Ile Trp Trp Leu Gln Tyr Phe Ile Thr Arg Ala Glu Ala His Leu
35 40 45

caa gtg tgg atc ccc ccc ctc aac gtt cgg ggg ggc cgc gat gcc gtc 1993
Gln Val Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val
50 55 60

atc ctc ctc acg tgc gcg atc cac cca gag cta atc ttt acc atc acc 2041
Ile Leu Leu Thr Cys Ala Ile His Pro Glu Leu Ile Phe Thr Ile Thr
65 70 75 80

aaa atc ttg ctc gcc ata ctc ggt cca ctc atg gtg ctc cag gct ggt 2089
Lys Ile Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly
85 90 95

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg ctc att cgt gca 2137
Ile Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu Ile Arg Ala
100 105 110

tgc atg ctg gtg cgg aag gtt gct ggg ggt cat tat gtc caa atg gct 2185
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met Ala
115 120 125

ctc atg aag ttg gcc gca ctg aca ggt acg tac gtt tat gac cat ctc 2233
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu
130 135 140

acc cca ctg cgg gac tgg gcc cac gcg ggc cta cga gac ctt gcg gtg 2281
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val
145 150 155 160

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt atc acc 2329
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val Ile Thr
165 170 175

tgg ggg gca gac acc gcg gcg tgt ggg gac atc atc ttg ggc ctg ccc 2377
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro
180 185 190

gtc tcc gcc cgc agg ggg agg gag ata cat ctg gga ccg gca gac agc 2425
Val Ser Ala Arg Arg Gly Arg Glu Ile His Leu Gly Pro Ala Asp Ser
195 200 205

ctt gaa ggg cag ggg tgg cga ctc ctc gcg cct att acg gcc tac tcc 2473
Leu Glu Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser
210 215 220

caa cag acg cga ggc cta ctt ggc tgc atc atc act agc ctc aca ggc 2521
Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly
225 230 235 240

cgg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tcc acc gca 2569
Arg Asp Arg Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
245 250 255

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617
Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
260 265 270

tat cat ggt gcc ggc tca aag acc ctt gcc ggc cca aag ggc cca atc 2665
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile
275 280 285

acc caa atg tac acc aat gtg gac cag gac ctc gtc ggc tgg caa gcg 2713
Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
290 295 300

ccc ccc ggg gcg cgt tcc ttg aca cca tgc acc tgc ggc agc tcg gac 2761
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp
305 310 315 320

ctt tac ttg gtc acg aag cat gcc gat gtc att ccg gtg cgc cgg cgg 2809
Leu Tyr Leu Val Thr Lys His Ala Asp Val Ile Pro Val Arg Arg Arg
325 330 335

ggc gac agc agg ggg agc cta ctc tcc ccc cgg ccc gtc tcc tac ttg 2857
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu
340 345 350

aag ggc tct tcg ggc ggt cca ctg ctc tgc ccc tcg ggg cac gct gtg 2905
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
355 360 365

ggc atc ttt cgg gct gcc gtg tgc acc cga ggg gtt gcg aag gcg gtg 2953
Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
370 375 380

gac ttt gta ccc gtc gag tct atg gaa acc act atg cgg tcc ccg gtc 3001
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val
385 390 395 400

ttc acg gac aac tcg tcc cct ccg gcc gta ccg cag aca ttc cag gtg 3049
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Thr Phe Gln Val
405 410 415

gcc cat cta cac gcc cct act ggt agc ggc aag agc act aag gtg ccg 3097
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
420 425 430

gct gcg tat gca gcc caa ggg tat aag gtg ctt gtc ctg aac ccg tcc 3145
Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
435 440 445

gtc gcc gcc acc cta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly
450 455 460

atc gac cct aac atc aga acc ggg gta agg acc atc acc acg ggt gcc 3241
Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
465 470 475 480

ccc atc acg tac tcc acc tat ggc aag ttt ctt gcc gac ggt ggt tgc 3289
Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys
485 490 495

tct ggg ggc gcc tat gac atc ata ata tgt gat gag tgc cac tca act 3337
Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr
500 505 510

gac tcg acc act atc ctg ggc atc ggc aca gtc ctg gac caa gcg gag 3385
Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
515 520 525

acg gct gga gcg cga ctc gtc gtg ctc gcc acc gct acg cct ccg gga 3433
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly
530 535 540

tcg gtc acc gtg cca cat cca aac atc gag gag gtg gct ctg tcc agc 3481
Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Ser
545 550 555 560

act gga gaa atc ccc ttt tat ggc aaa gcc atc ccc atc gag acc atc 3529
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile
565 570 575

aag ggg ggg agg cac ctc att ttc tgc cat tcc aag aag aaa tgt gat 3577
Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
580 585 590

gag ctc gcc gcg aag ctg tcc ggc ctc gga ctc aat gct gta gca tat 3625
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr
595 600 605

tac cgg ggc ctt gat gta tcc gtc ata cca act agc gga gac gtc att 3673
Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Ile
610 615 620

gtc gta gca acg gac gct cta atg acg ggc ttt acc ggc gat ttc gac 3721
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp
625 630 635 640

tca gtg atc gac tgc aat aca tgt gtc acc cag aca gtc gac ttc agc 3769
Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser
645 650 655

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817
Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala
660 665 670

gtg tca cgc tcg cag cgg cga ggc agg act ggt agg ggc agg atg ggc 3865
Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly
675 680 685

att tac agg ttt gtg act cca gga gaa cgg ccc tcg ggc atg ttc gat 3913
Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
690 695 700

tcc tcg gtt ctg tgc gag tgc tat gac gcg ggc tgt gct tgg tac gag 3961
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu
705 710 715 720

ctc acg ccc gcc gag acc tca gtt agg ttg cgg gct tac cta aac aca 4009
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr
725 730 735

cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag ggc gtc 4057
Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val
740 745 750

ttt aca ggc ctc acc cac ata gac gcc cat ttc ttg tcc cag act aag 4105
Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
755 760 765



cag gca gga gac aac ttc ccc tac ctg gta gca tac cag gct acg gtg 4153
Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
770 775 780

tgc gcc agg gct cag gct cca cct cca tcg tgg gac caa atg tgg aag 4201
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
785 790 795 800

tgt ctc ata cgg cta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
805 810 815

tat agg ctg gga gcc gtt caa aac gag gtt act acc aca cac ccc ata 4297
Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Thr Thr His Pro Ile
820 825 830

acc aaa tac atc atg gca tgc atg tcg gct gac ctg gag gtc gtc acg 4345
Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr
835 840 845

agc acc tgg gtg ctg gta ggc gga gtc cta gca gct ctg gcc gcg tat 4393
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr
850 855 860

tgc ctg aca aca ggc agc gtg gtc att gtg ggc agg atc atc ttg tcc 4441
Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser
865 870 875 880

gga agg ccg gcc atc att ccc gac agg gaa gtc ctt tac cgg gag ttc 4489
Gly Arg Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe
885 890 895

gat gag atg gaa gag tgc gcc tca cac ctc cct tac atc gaa cag gga 4537
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly
900 905 910

atg cag ctc gcc gaa caa ttc aaa cag aag gca atc ggg ttg ctg caa 4585
Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu Gln
915 920 925

aca gcc acc aag caa gcg gag gct gct gct ccc gtg gtg gaa tcc aag 4633
Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys
930 935 940

tgg cgg acc ctc gaa gcc ttc tgg gcg aag cat atg tgg aat ttc atc 4681
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe Ile
945 950 955 960

agc ggg ata caa tat tta gca ggc ttg tcc act ctg cct ggc aac ccc 4729
Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro
965 970 975

gcg ata gca tca ctg atg gca ttc aca gcc tct atc acc agc ccg ctc 4777
Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu
980 985 990

acc acc caa cat acc ctc ctg ttt aac atc ctg ggg gga tgg gtg gcc 4825
Thr Thr Gln His Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala
995 1000 1005

gcc caa ctt gct cct ccc agc gct gct tcc gct ttc gta ggc gcc ggc 4873
Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly
1010 1015 1020

atc gct gga gcg gct gtt ggc agc ata ggc ctt ggg aag gtg ctt gtg 4921
Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val
1025 1030 1035 1040

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg ctc gtg gcc 4969
Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala
1045 1050 1055

ttt aag gtc atg agc ggc gag atg ccc tcc acc gag gac ctg gtt aac 5017
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn
1060 1065 1070

cta ctc cct gct atc ctc tcc cct ggc gcc cta gtc gtc ggg gtc gtg 5065
Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val
1075 1080 1085

tgc gca gcg ata ctg cgt cgg cac gtg ggc cca ggg gag ggg gct gtg 5113
Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val
1090 1095 1100

cag tgg atg aac cgg ctg ata gcg ttc gct tcg cgg ggt aac cac gtc 5161
Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val
1105 1110 1115 1120

tcc ccc acg cac tat gtg cct gag agc gac gct gca gca cgt gtc act 5209
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr
1125 1130 1135

cag atc ctc tct agt ctt acc atc act cag ctg ctg aag agg ctt cac 5257
Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His
1140 1145 1150

cag tgg atc aac gag gac tgc tcc acg cca tgc tcc ggc tcg tgg cta 5305
Gln Trp Ile Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu
1155 1160 1165

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag gcc 5353
Arg Asp Val Trp Asp Trp Ile Cys Thr Val Leu Thr Asp Phe Lys Ala
1170 1175 1180

tgg ctc cag tcc aag ctc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401
Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe
1185 1190 1195 1200

tca tgt caa cgt ggg tac aag gga gtc tgg cgg ggc gac ggc atc atg 5449
Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile Met
1205 1210 1215

caa acc acc tgc cca tgt gga gca cag atc acc gga cat gtg aaa aac 5497
Gln Thr Thr Cys Pro Cys Gly Ala Gln Ile Thr Gly His Val Lys Asn
1220 1225 1230

tgt tcc atg agg atc gtg ggg cct agg acc tgt agt aac acg tgg cat 5545
Cys Ser Met Arg Ile Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His
1235 1240 1245

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tcc 5593
Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser
1250 1255 1260

ccg gcg cca aat tat tct agg gcg ctg tgg cgg gtg gct gct gag gag 5641
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu
1265 1270 1275 1280

tac gtg gag gtt acg cga gtg ggg gat ttc cac tac gtg acg ggc atg 5689
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met
1285 1290 1295

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gcc ccc gaa ttc 5737
Thr Thr Asp Asn Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe
1300 1305 1310

ttc aca gaa gtg gat ggg gtg cgg ttg cac agg tac gct cca gcg tgc 5785
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys
1315 1320 1325

aaa ccc ctc cta cgg gag gag gtc aca ttc ctg gtc ggg ctc aat caa 5833
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gln
1330 1335 1340

tac ctg gtt ggg tca cag ctc cca tgc gag ccc gaa ctg gac gta gca 5881
Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Leu Asp Val Ala
1345 1350 1355 1360

gtg ctc act tcc atg ctc acc gac ccc tcc cac att acg gcg gag acg 5929
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr
1365 1370 1375

gct aag cgt agg ctg gcc agg gga tct ccc ccc tcc ttg gcc agc tca 5977
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser
1380 1385 1390

tca gct agc cag ctg tct gcg cct tcc ttg aag gca aca tgc act acc 6025
Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr
1395 1400 1405

cgt cat gac tcc ccg gac gct gac ctc atc gag gcc aac ctc ctg tgg 6073
Arg His Asp Ser Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu Trp
1410 1415 1420

cgg cag gag atg ggc ggg aac atc acc cgc gtg gag tca gaa aat aag 6121
Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys
1425 1430 1435 1440

gta gta att ttg gac tct ttc gag ccg ctc caa gcg gag gag gat gag 6169
Val Val Ile Leu Asp Ser Phe Glu Pro Leu Gln Ala Glu Glu Asp Glu
1445 1450 1455

agg gaa gta tcc gtt ccg gcg gag atc ctg cgg agg tcc agg aaa ttc 6217
Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys Phe
1460 1465 1470

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265
Pro Arg Ala Met Pro Ile Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
1475 1480 1485

ttg gag tcc tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly
1490 1495 1500

tgt cca ttg ccg cct gcc aag gcc cct ccg ata cca cct cca cgg agg 6361
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
1505 1510 1515 1520

aag agg acg gtt gtc ctg tca gaa tct acc gtg tct tct gcc ttg gcg 6409
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala
1525 1530 1535

gag ctc gcc aca aag acc ttc ggc agc tcc gaa tcg tcg gcc gtc gac 6457
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp
1540 1545 1550

agc ggc acg gca acg gcc tct cct gac cag ccc tcc gac gac ggc gac 6505
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly Asp
1555 1560 1565

gcg gga tcc gac gtt gag tcg tac tcc tcc atg ccc ccc ctt gag ggg 6553
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
1570 1575 1580

gag ccg ggg gat ccc gat ctc agc gac ggg tct tgg tct acc gta agc 6601
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
1585 1590 1595 1600

gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca tgg 6649 
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 

1605 1610 1615 

acg ggc gcc ctg ate acg cca tgc get gcg gag gaa acc aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630 

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
lie Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

get aca aca tct cgc age gca age ctg egg cag aag aag gtc acc ttt 6793 
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr Phe 
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tec aca gtt aag get aaa ctt eta tec gtg gag 6889 

Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu 

1685 1690 1695 

gaa gcc tgt aag ctg acg ccc cca cat teg gcc aga tct aaa ttt ggc 6937 

Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc cgg aac cta tcc agc aag gcc gtt aac cac 6985 
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

atc cgc tcc gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
Ile Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro Ile 
1730 1735 1740 

gac acc acc ate atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081 
Asp Thr Thr lie Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu 
1745 1750 1755 1760 

aag ggg ggc cgc aag cca get cgc ctt ate gta ttc cca gat ttg ggg 7129 
Lys Gly Gly Arg Lys Pro Ala Arg Leu lie Val Phe Pro Asp Leu Gly 

1765 1770 1775 

gtt cgt gtg tgc gag aaa atg gec ctt tac gat gtg gtc tec acc etc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 

1780 1785 1790 

cct cag gec gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805 

cag egg gtc gag ttc ctg gtg aat gee tgg aaa gcg aag aaa tgc cct 7273 
Gin Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro 
1810 1815 1820 

atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gee 7369 
Asn Asp He Arg Val Glu Glu Ser He Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gcc aga cag gcc ata agg tcg ctc aca gag cgg ctt tac atc 7417 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile 
1860 1865 1870 

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc egg 7465 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gin Asn Cys Gly Tyr Arg Arg 
1875 1880 1885 

tgc cgc gcg age ggt gta ctg acg acc age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900 

tgt tac ttg aag gee get gcg gee tgt cga get gcg aag etc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gin Asp 
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser 

1925 1930 1935 

gcg ggg acc caa gag gac gag gcg agc cta cgg gcc ttc acg gag gct 7657 
Ala Gly Thr Gln Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala 
1940 1945 1950 

atg act aga tac tct gcc ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980 

gat gca tct ggc aaa agg gtg tac tat etc ace cgt gac ccc acc ace 7801 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
1985 1990 1995 2000 

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015 

tec tgg eta ggc aac ate ate atg tat gcg ccc acc ttg tgg gca agg 7897 
Ser Trp Leu Gly Asn lie lie Met Tyr Ala Pro Thr Leu Trp Ala Arg 

2020 2025 2030 

atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa caa 7945 
Met lie Leu Met Thr His Phe Phe Ser lie Leu Leu Ala Gin Glu Gin 
2035 2040 2045 

ctt gaa aaa gcc eta gat tgt cag ate tac ggg gcc tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin lie Tyr Gly Ala Cys Tyr Ser lie 
2050 2055 2060 

gag cca ctt gac cta cct cag atc att caa cga ctc cac ggc ctt agc 8041 
Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu lie Asn Arg Val Ala 

2085 2090 2095 

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gcc aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gcc act tgt ggc aag tac etc ttc aac tgg gca gta agg acc aag etc 8233 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 
2130 2135 2140 

aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age tgg 8281 
Lys Leu Thr Pro lie Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser Trp 
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gcc cga ccc cgc tgg ttc atg tgg tgc cta ctc cta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 
2180 2185 2190 

gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427 
Val Gly Ile Tyr Leu Leu Pro Asn Arg * 
2195 2200 

ggccaatagg ccatcctgtt ttttcccttt tttttttttt tttttttttt tttttttttt 8487 
tttttttttt tttttttttt ttttcttttt tcccaatttt tttccttttc tttcctttgg 8547 
tggctccatc ttagccctag tcacggctag ctgtgaaagg tccgtgagcc gcttgactgc 8607 
agagagtgct gatactggcc tctctgcaga tcaagt 8643 

<210> 5 
<211> 8648 
<212> DNA 
<213> HCV 
<220> 
<221> CDS 
<222> (1802)...(8407) 
<400> 5 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200 
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260 
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320 
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380 
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440 
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560 
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680 
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740 
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800 
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849 
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly 
1 5 10 15 

ctg ata ctc ttg acc ttg tca ccg cac tat aag ctg ttc ctc gct agg 1897 
Leu Ile Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg 
20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gec gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gee gtc 1993 
Gin Val Trp He Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 

ate etc etc acg tgc gcg ate cac cca gag eta ate ttt acc ate acc 2041 
He Leu Leu Thr Cys Ala He His Pro Glu Leu He Phe Thr He Thr 
65 70 75 80 

aaa ate ttg etc gee ata etc ggt cca etc atg gtg etc cag get ggt 2089 
Lys He Leu Leu Ala He Leu Gly Pro Leu Met Val Leu Gin Ala Gly 

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
He Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu He Arg Ala 

100 105 110 

tgc atg ctg gtg egg aag gtt get ggg ggt cat tat gtc caa atg get 2185 
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gin Met Ala 
115 120 125 

etc atg aag ttg gee gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

acc cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate acc 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val He Thr 

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp He He Leu Gly Leu Pro 

180 185 190 

gtc tec gee cgc agg ggg agg gag ata cat ctg gga ccg gca gac age 2425 
Val Ser Ala Arg Arg Gly Arg Glu He His Leu Gly Pro Ala Asp Ser 
195 200 205 

ctt gaa ggg cag ggg tgg cga etc etc gcg cct att acg gee tac tec 2473 
Leu Glu Gly Gin Gly Trp Arg Leu Leu Ala Pro He Thr Ala Tyr Ser 
210 215 220 

caa cag acg cga ggc eta ctt ggc tgc ate ate act age etc aca ggc 2521 
Gin Gin Thr Arg Gly Leu Leu Gly Cys He He Thr Ser Leu Thr Gly 
225 230 235 240 

cgg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tcc acc gca 2569 
Arg Asp Arg Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala 
245 250 255 

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617 
Thr Gin Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 

260 265 270 

tat cat ggt gec ggc tea aag acc ctt gec ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro lie 
275 280 285 

acc caa atg tac acc aat gtg gac cag gac etc gtc ggc tgg caa gcg 2713 
Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Gin Ala 
290 295 300 

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg agg cat gee gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Arg His Ala Asp Val He Pro Val Arg Arg Arg 

325 330 335 

ggc gac age agg ggg age eta etc tec ccc agg ccc gtc tec tac ttg 2857 
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 

340 345 350 

aag ggc tct teg ggc ggt cca ctg etc tgc ccc teg ggg cac get gtg 2905 
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
355 360 365 

ggc ate ttt egg get gee gtg tgc acc egg ggg gtt gcg aag gcg gtg 2953 
Gly He Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val 
370 375 380 

gac ttt gta ccc gtc gag tct atg gga acc act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Gly Thr Thr Met Arg Ser Pro Val 
385 390 395 400 

ttc acg gac aac teg tec cct ccg gee gta ccg cag aca ttc cag gtg 3049 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gin Thr Phe Gin Val 

405 410 415 

gee cat eta cac gec cct act ggt age ggc aag age act aag gtg ccg 3097 
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

420 425 430 

get gcg tat gca gec caa ggg tat aag gtg ctt gtc ctg aac ccg tec 3145 
Ala Ala Tyr Ala Ala Gin Gly Tyr Lys Val Leu Val Leu Asn Pro Ser 
435 440 445 

gtc gee gec acc eta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
450 455 460 

atc gac cct aac atc aga acc ggg gta agg acc atc acc acg ggt gcc 3241 
Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala 
465 470 475 480 

ccc ate acg tac tec acc tat ggc aag ttt ctt gee gac ggt ggt tgc 3289 
Pro He Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gcc tat gac atc ata ata tgt gat gag tgc cac tca act 3337 
Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr 
500 505 510 

gac teg acc act ate ctg ggc ate ggc aca gtc ctg gac caa gcg gag 3385 
Asp Ser Thr Thr He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu 
515 520 525 

acg get gga gcg cga etc gtc gtg etc gee acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540 

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn He Glu Glu Val Ala Leu Ser Ser 
545 550 555 560 

act gga gaa atc ccc ttt tat ggc aaa gcc atc ccc atc gag acc atc 3529 
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile 
565 570 575 

aag ggg ggg agg cac etc att ttc tgc cat tec aag aag aaa tgt gat 3577 
Lys Gly Gly Arg His Leu He Phe Cys His Ser Lys Lys Lys Cys Asp 

580 585 590 

gag etc gee gcg aag ctg tec ggc etc gga etc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val He 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 

tea gtg ate gac tgc aat aca tgt gtc acc cag aca gtc gac ttc age 3769 
Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser 

645 650 655 

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817 
Leu Asp Pro Thr Phe Thr He Glu Thr Thr Thr Val Pro Gin Asp Ala 

660 665 670 

gtg tea cgc teg cag egg cga ggc agg act ggt agg ggc agg atg ggc 3865 
Val Ser Arg Ser Gin Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly 
675 680 685 

att tac agg ttt gtg act cca gga gaa cgg ccc tcg ggc atg ttc gat 3913 
Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp 
690 695 700 

tec teg gtt ctg tgc gag tgc tat gac gcg ggc tgt get tgg tac gag 3961 

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
705 710 715 720 

etc acg ccc gee gag acc tea gtt agg ttg egg get tac eta aac aca 4009 

Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 

725 730 735 

cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag age gtc 4057 

Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Ser Val 

740 745 750 

ttt aca ggc etc acc cac ata gac gee cat ttc ttg tec cag act aag 4105 

Phe Thr Gly Leu Thr His lie Asp Ala His Phe Leu Ser Gin Thr Lys 

755 760 765 

cag gca gga gac aac ttc ccc tac ctg gta gca tac cag gct acg gtg 4153 
Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val 
770 775 780 

tgc gee agg get cag get cca cct cca teg tgg gac caa atg tgg aag 4201 

Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp Lys 
785 790 795 800 

tgt ctc ata cgg cta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249 
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu 
805 810 815 

tat agg ctg gga gec gtt caa aac gag gtt act acc aca cac ccc ata 4297 

Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro He 

820 825 830 

acc aaa tac ate atg gca tgc atg teg get gac ctg gag gtc gtc acg 4345 

Thr Lys Tyr He Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr 

835 840 845 

agc acc tgg gtg ctg gta ggc gga gtc cta gca gct ctg gcc gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc age gtg gtc att gtg ggc agg ate ate ttg tec 4441 

Cys Leu Thr Thr Gly Ser Val Val He Val Gly Arg He He Leu Ser 
865 870 875 880 

gga aag ccg gee ate att ccc gac agg gaa gtc ctt tac egg gag ttc 4489 

Gly Lys Pro Ala He He Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe 

885 890 895 

gat gag atg gaa gag tgc gec tea cac etc cct tac ate gaa cag gga 4537 

Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr He Glu Gin Gly 

900 905 910 

atg cag ctc gcc gaa caa ttc aaa cag aag gca atc ggg ttg ctg caa 4585 
Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu Gln 
915 920 925 

aca gec acc aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys 
930 935 940 

tgg egg acc etc gaa gec ttc tgg gcg aag cat atg tgg aat ttc ate 4681 
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe lie 
945 950 955 960 

age ggg ata caa tat tta gca ggc ttg tec act ctg cct ggc aac ccc 4729 
Ser Gly lie Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro 

965 970 975 

gcg ata gca tca ctg atg gca ttc aca gcc tct atc acc agc ccg ctc 4777 
Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu 
980 985 990 

acc acc caa cat acc etc ctg ttt aac ate ctg ggg gga tgg gtg gee 4825 
Thr Thr Gin His Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val Ala 
995 1000 1005 

gec caa ctt get cct ccc age get get tct get ttc gta ggc gee ggc 4873 
Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly 
1010 1015 1020 

ate get gga gcg get gtt ggc age ata ggc ctt ggg aag gtg ctt gtg 4921 
He Ala Gly Ala Ala Val Gly Ser He Gly Leu Gly Lys Val Leu Val 
1025 1030 1035 1040 

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg etc gtg gee 4969 
Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala 

1045 1050 1055 

ttt aag gtc atg age ggc gag atg ccc tec acc gag gac ctg gtt aac 5017 
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn 

1060 1065 1070 

eta etc cct get ate etc tec cct ggc gee eta gtc gtc ggg gtc gtg 5065 
Leu Leu Pro Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val Val 
1075 1080 1085 

tgc gca gcg ata ctg cgt egg cac gtg ggc cca ggg gag ggg get gtg 5113 
Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val 
1090 1095 1100 

cag tgg atg aac egg ctg ata gcg ttc get teg egg ggt aac cac gtc 5161 
Gin Trp Met Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His Val 
1105 1110 1115 1120 

tec ccc acg cac tat gtg cct gag age gac get gca gca cgt gtc act 5209 
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr 

1125 1130 1135 

cag atc ctc tct agt ctt acc atc act cag ctg ctg aag agg ctt cac 5257 
Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His 
1140 1145 1150 

cag tgg ate aac gag gac tgc tec acg cca tgc tec ggc teg tgg eta 5305 
Gin Trp He Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu 
1155 1160 1165 

aga gat gtt tgg gat tgg gta tgc acg gtg ttg act gat ttc aag acc 5353 
Arg Asp Val Trp Asp Trp Val Cys Thr Val Leu Thr Asp Phe Lys Thr 
1170 1175 1180 

tgg etc cag tec aag etc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401 
Trp Leu Gin Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe 
1185 1190 1195 1200 

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He Met 

1205 1210 1215 

caa acc acc tgc cca tgt gga gca cag ate acc gga cat gtg aaa aac 5497 
Gin Thr Thr Cys Pro Cys Gly Ala Gin He Thr Gly His Val Lys Asn 

1220 1225 1230 

tgt tec atg agg ate gtg ggg cct agg acc tgt agt aac acg tgg cat 5545 
Cys Ser Met Arg He Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His 
1235 1240 1245 

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tec 5593 
Gly Thr Phe Pro He Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser 
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg egg gtg get get gag gag 5641 
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu 
1265 1270 1275 1280 

tac gtg gag gtt acg egg gtg ggg gat ttc cac tac gtg acg ggc atg 5689 
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met 

1285 1290 1295 

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gec ccc gaa ttc 5737 
Thr Thr Asp Asn Val Lys Cys Pro Cys Gin Val Pro Ala Pro Glu Phe 

1300 1305 1310 

ttc aca gaa gtg gat ggg gtg egg ttg cac agg tac get cca gcg tgc 5785 
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys 
1315 1320 1325 

aaa ccc etc eta egg gag gag gtc aca ttc ctg gtc ggg etc aat caa 5833 
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gin 
1330 1335 1340 

tac ctg gtt ggg tea cag etc cca tgc gag ccc gaa ccg gac gta gca 5881 
Tyr Leu Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp Val Ala 
1345 1350 1355 1360 

gtg ctc act tcc atg ctc acc gac ccc tcc cac att acg gcg gag acg 5929 
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr 
1365 1370 1375 

get aag cgt agg ctg gec agg gga tct ccc ccc tec ttg gec age tea 5977 
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg ccc tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 

cgt cat gac tec ccg gac get gac etc ate gag gec aac etc ctg tgg 6073 
Arg His Asp Ser Pro Asp Ala Asp Leu lie Glu Ala Asn Leu Leu Trp 
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gaa aat aag 6121 
Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 

gta gta att ttg gac tct ttc gag ccg etc caa gcg gag gag gat gag 6169 
Val Val lie Leu Asp Ser Phe Glu Pro Leu Gin Ala Glu Glu Asp Glu 

1445 1450 1455 

agg gaa gta tec gtt ccg gcg gag ate ctg egg agg tec agg aaa ttc 6217 
Arg Glu Val Ser Val Pro Ala Glu lie Leu Arg Arg Ser Arg Lys Phe 

1460 1465 1470 

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265 
Pro Arg Ala Met Pro lie Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu 
1475 1480 1485 

tta gag tec tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313 
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly 
1490 1495 1500 

tgt cca ttg ccg cct gee aag gee cct ccg ata cca cct cca egg agg 6361 
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro lie Pro Pro Pro Arg Arg 
1505 1510 1515 1520 

aag agg acg gtt gtc ctg tea gaa tct acc gtg tct tct gee ttg gcg 6409 
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala 

1525 1530 1535 

gag etc gec aca aag acc ttc ggc age tec gaa teg teg gee gtc gac 6457 

Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp 

1540 1545 1550 

age ggc acg gca acg gee tct cct gac cag ccc tec gac gac ggc gac 6505 
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gin Pro Ser Asp Asp Gly Asp 
1555 1560 1565 

gcg gga tec gac gtt gag teg tac tec tec atg ccc ccc ctt gag ggg 6553 
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 
1570 1575 1580 

gag ccg ggg gat ccc gat ctc agc gac ggg tct tgg tct acc gta agc 6601 
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser 
1585 1590 1595 1600 

gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca tgg 6649 
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 

1605 1610 1615 

aca ggc gec ctg ate acg cca tgc get gcg gag gaa acc aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630 

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
lie Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

get aca aca tct cgc age gca age ctg egg cag aag aag gtc acc ttt 6793 
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr Phe 
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tec aca gtt aag get aaa ctt eta tec gtg gag 6889 
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu 

1685 1690 1695 

gaa gee tgt aag ctg acg ccc cca cat teg gec aga tct aaa ttt ggc 6937 
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc egg aac eta tec age aag gee gtt aac cac 6985 
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

ate cgc tec gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
lie Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro lie 
1730 1735 1740 

gac acc acc ate atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081 
Asp Thr Thr lie Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu 
1745 1750 1755 1760 

aag ggg ggc cgc aag cca get cgc ctt ate gta ttc cca gat ttg ggg 7129 
Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp Leu Gly 

1765 1770 1775 

gtt cgt gtg tgc gag aaa atg gcc ctt tac gat gtg gtc tcc acc ctc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 
1780 1785 1790 

cct cag gcc gtg atg ggc tct tca tac gga ttc caa tac tct cct gga 7225 
Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly 
1795 1800 1805 

cag cgg gtc gag ttc ctg gtg aat gct tgg aaa gcg aag aaa tgc cct 7273 
Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro 
1810 1815 1820 

atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gec 7369 
Asn Asp lie Arg Val Glu Glu Ser lie Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gec aga cag gec ata agg teg etc aca gag egg ctt tac ate 7417 
Pro Glu Ala Arg Gin Ala He Arg Ser Leu Thr Glu Arg Leu Tyr He 

1860 1865 1870 

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc cgg 7465 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg 
1875 1880 1885 

tgc cgc gcg age ggt gta ctg acg acc age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900 

tgt tac ttg aag gcc gct gcg gcc tgt cga gct gcg aag ctc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp 
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser 

1925 1930 1935 

gcg ggg acc caa gag gac gag gcg agc cta cgg gcc ttc acg gag gct 7657 
Ala Gly Thr Gln Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala 
1940 1945 1950 

atg act aga tac tct gee ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu He Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980 

gat gca tct ggc aaa agg gtg tac tat etc acc cgt gac ccc acc acc 7801 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
1985 1990 1995 2000 

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015 

tec tgg eta ggc aac ate ate atg tat gcg ccc acc ttg tgg gca agg 7897 
Ser Trp Leu Gly Asn He He Met Tyr Ala Pro Thr Leu Trp Ala Arg 

2020 2025 2030 




atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa caa 7945 
Met He Leu Met Thr His Phe Phe Ser He Leu Leu Ala Gin Glu Gin 
2035 2040 2045 

ctt gaa aaa gec eta gat tgt cag ate tac ggg gec tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser lie 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg Val Ala 

2085 2090 2095 

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gec aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gcc act tgt ggc aag tac ctc ttc aac tgg gca gta agg acc aag ctc 8233 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 
2130 2135 2140 

aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age tgg 8281 
Lys Leu Thr Pro He Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser Trp 
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gec cga ccc cgc tgg ttc acg tgg tgc eta etc eta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Thr Trp Cys Leu Leu Leu Leu Ser Val Gly 

2180 2185 2190 

gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427 
Val Gly Ile Tyr Leu Leu Pro Asn Arg * 
2195 2200 

ggccaatagg ccatcctgtt tttttccctt ttttcccttt tttttttttt tttttttttt 8487 
tttttttttt tttttttttt ttccccccct tttttcccct ttttttttcc ttttctttcc 8547 
tttggtggct ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcttg 8607 
actgcagaga gtgctgatac tggcctctct gcagatcaag t 8648 

<210> 6 
<211> 8638 
<212> DNA 
<213> HCV 
<220> 
<221> CDS 
<222> (1802)...(8407) 
<400> 6 

accagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260 
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320 
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380 
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560 
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680 
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740 
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800 
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

ctg ata etc ttg ace ttg tea ccg cac tat aag ctg ttc etc get agg 1897 
Leu lie Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg 

20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gee gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gec gtc 1993 
Gin Val Trp lie Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 

ate etc etc acg tgc gcg ate cac cca gag eta ate ttt acc ate acc 2041 
He Leu Leu Thr Cys Ala He His Pro Glu Leu He Phe Thr He Thr 
65 70 75 80 

aaa atc ttg ctc gcc ata ctc ggt cca ctc atg gtg ctc cag gct ggt 2089
Lys Ile Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
He Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu He Arg Ala 

100 105 110

tgc atg ctg gtg cgg aag gtt gct ggg ggt cat tat gtc caa atg gct 2185
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met Ala
115 120 125 

etc atg aag ttg gec gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

ace cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate ace 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val lie Thr 

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp lie lie Leu Gly Leu Pro 

180 185 190 

gtc tec gee cgc agg ggg agg gag ata cat ctg gga ccg gca gac age 2425 
Val Ser Ala Arg Arg Gly Arg Glu lie His Leu Gly Pro Ala Asp Ser 
195 200 205 

ctt gaa ggg cag ggg tgg cga etc etc gcg cct att acg gec tac tec 2473 
Leu Glu Gly Gin Gly Trp Arg Leu Leu Ala Pro lie Thr Ala Tyr Ser 
210 215 220

caa cag acg cga ggc eta ctt ggc tgc ate ate act age etc aca ggc 2521 
Gin Gin Thr Arg Gly Leu Leu Gly Cys lie lie Thr Ser Leu Thr Gly 
225 230 235 240 

egg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tec acc gca 2569 
Arg Asp Arg Asn Gin Val Glu Gly Glu Val Gin Val Val Ser Thr Ala 

245 250 255 

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617 
Thr Gin Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 

260 265 270 

tat cat ggt gee ggc tea aag acc ctt gee ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro lie 
275 280 285 

acc caa atg tac acc aat gtg gac cag gac etc gtc ggc tgg caa gcg 2713 
Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Gin Ala 
290 295 300 

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg agg cat gec gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Arg His Ala Asp Val lie Pro Val Arg Arg Arg 

325 330 335

ggc gac ggc agg ggg agc cta ctc tcc ccc agg ccc gtc tcc tac ttg 2857
Gly Asp Gly Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu

340 345 350 

aag ggc tct teg ggc ggt cca ctg etc tgc ccc teg ggg cac get gtg 2905 
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
355 360 365 

ggc ate ttt egg get gee gtg tgc ace cga ggg gtt gcg aag gcg gtg 2953 
Gly lie Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val 
370 375 380 

gac ttt gta ccc gtc gag tct atg gga ace act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Gly Thr Thr Met Arg Ser Pro Val 
385 390 395 400 

ttc acg gac aac teg tec cct ccg gee gta ccg cag aca ttc cag gtg 3049 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gin Thr Phe Gin Val 

405 410 415 

gec cat eta cac gee cct act ggt age ggc aag age act aag gtg ccg 3097 
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

420 425 430 

get gcg tat gca gee caa ggg tat aag gtg ctt gtc ctg aac ccg tec 3145 
Ala Ala Tyr Ala Ala Gin Gly Tyr Lys Val Leu Val Leu Asn Pro Ser 
435 440 445 

gtc gec gee ace eta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
450 455 460 

ate gac cct aac ate aga acc ggg gta agg ace ate ace acg ggt gec 3241 
He Asp Pro Asn He Arg Thr Gly Val Arg Thr He Thr Thr Gly Ala 
465 470 475 480 

ccc ate acg tac tec acc tat ggc aag ttt ctt gec gac ggt ggt tgc 3289 
Pro He Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gcc tat gac atc ata ata tgt gat gag tgc cac tca act 3337
Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr

500 505 510 

gac teg acc act ate ctg ggc ate ggc aca gtc ctg gac caa gcg gag 3385 
Asp Ser Thr Thr He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu 
515 520 525 

acg get gga gcg cga etc gtc gtg etc gee acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn He Glu Glu Val Ala Leu Ser Ser 
545 550 555 560

act gga gaa atc ccc ttt tat ggc aaa gcc atc ccc atc gag acc atc 3529
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile

565 570 575 

aag ggg ggg agg cac ctc att ttc tgc cat tcc aag aag aaa tgt gat 3577
Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp

580 585 590 

gag ctc gee gcg aag ctg tec ggc ctc gga etc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val He 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 

tea gtg ate gac tgc aat aca tgt gtc acc cag aca gtc gac ttc age 3769 
Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser 

645 650 655 

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817 
Leu Asp Pro Thr Phe Thr He Glu Thr Thr Thr Val Pro Gin Asp Ala 

660 665 670 

gtg tea cgc teg cag egg cga ggc agg act ggt agg ggc agg atg ggc 3865 
Val Ser Arg Ser Gin Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly 
675 680 685 

att tac agg ttt gtg act cca gga gaa egg ccc teg ggc atg ttc gat 3913 
He Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp 
690 695 700 

tec teg gtt ctg tgc gag tgc tat gac gcg ggc tgt get tgg tac gag 3961 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
705 710 715 720 

etc acg ccc gee gag acc tea gtt agg ttg egg get tac eta aac aca 4009 
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 

725 730 735 

cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag age gtc 4057 
Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Ser Val 

740 745 750 

ttt aca ggc ctc acc cac ata gac gee cat ttc ttg tec cag act aag 4105 
Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser Gin Thr Lys 
755 760 765 

cag gca gga gac aac ttc ccc tac ctg gta gca tac cag get acg gtg 4153 
Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val 
770 775 780

tgc gcc agg gct cag gct cca cct cca tcg tgg gac caa atg tgg aag 4201
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
785 790 795 800 

tgt etc ata egg eta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249 
Cys Leu lie Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu 

805 810 815 

tat agg ctg gga gcc gtt caa aac gag gtt act acc aca cac ccc ata 4297 
Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro lie 

820 825 830 

acc aaa tac ate atg gca tgc atg teg get gac ctg gag gtc gtc acg 4345 
Thr Lys Tyr lie Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr 
835 840 845 

age acc tgg gtg ctg gta ggc gga gtc eta gca get ctg gcc gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc age gtg gtc att gtg ggc agg ate ate ttg tec 4441 
Cys Leu Thr Thr Gly Ser Val Val lie Val Gly Arg lie He Leu Ser 
865 870 875 880 

gga aag ccg gcc ate att ccc gac agg gaa gtc ttt tac egg gag ttc 4489 
Gly Lys Pro Ala He He Pro Asp Arg Glu Val Phe Tyr Arg Glu Phe 

885 890 895 

gat gag atg gaa gag tgc gcc tea cac etc cct tac ate gaa cag gga 4537 
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr He Glu Gin Gly 

900 905 910 

atg cag ctc gcc gaa caa ttc aaa cag aag gca atc ggg ttg ctg caa 4585
Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu Gln
915 920 925 

aca gcc acc aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys 
930 935 940 

tgg egg acc etc gaa gcc ttc tgg gcg aag cat atg tgg aat ttc ate 4681 
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe He 
945 950 955 960 

age ggg ata caa tat tta gca ggc ttg tec act ctg cct ggc aac ccc 4729 
Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro 

965 970 975 

gcg ata gca tea ctg atg gca ttc aca gcc tct ate acc age ccg etc 4777 

Ala He Ala Ser Leu Met Ala Phe Thr Ala Ser He Thr Ser Pro Leu 

980 985 990 

acc acc caa cat acc etc ctg ttt aac ate ctg ggg gga tgg gtg gcc 4825 

Thr Thr Gin His Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val Ala 
995 1000 1005

gcc caa ctt gct cct ccc agc gct gct tct gct ttc gta ggc gcc ggc 4873
Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly
1010 1015 1020 

ate get gga gcg get gtt ggc age ata ggc ctt ggg aag gtg ctt gtg 4921 
lie Ala Gly Ala Ala Val Gly Ser He Gly Leu Gly Lys Val Leu Val 
1025 1030 1035 1040 

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg etc gtg gcc 4969 
Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala 

1045 1050 1055 

ttt aag gtc atg age ggc gag atg ccc tec acc gag gac ctg gtt aac 5017 
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn 

1060 1065 1070 

eta etc cct get ate etc tec cct ggc gcc eta gtc gtc ggg gtc gtg 5065 
Leu Leu Pro Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val Val 
1075 1080 1085 

tgc gca gcg ata ctg cgt egg cac gtg ggc cca ggg gag ggg get gtg 5113 
Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val 
1090 1095 1100 

cag tgg atg aac egg ctg ata gcg ttc get teg egg ggt aac cac gtc 5161 
Gin Trp Met Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His Val 
1105 1110 1115 1120 

tec ccc acg cac tat gtg cct gag age gac get gca gca cgt gtc act 5209 
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr 

1125 1130 1135 

cag ate etc tct agt ctt acc ate act cag ctg ctg aag agg ctt cac 5257 
Gin He Leu Ser Ser Leu Thr He Thr Gin Leu Leu Lys Arg Leu His 

1140 1145 1150 

cag tgg ate aac gag gac tgc tec acg cca tgc tec ggc teg tgg eta 5305 
Gin Trp He Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu 
1155 1160 1165 

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag acc 5353 
Arg Asp Val Trp Asp Trp He Cys Thr Val Leu Thr Asp Phe Lys Thr 
1170 1175 1180 

tgg etc cag tec aag etc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401 
Trp Leu Gin Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe 
1185 1190 1195 1200 

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He Met 

1205 1210 1215 



caa acc acc tgc cca tgt gga gca cag ate acc gga cat gtg aaa aac 5497 
Gin Thr Thr Cys Pro Cys Gly Ala Gin He Thr Gly His Val Lys Asn 

1220 1225 1230

cgt tcc atg agg atc gtg ggg cct agg acc tgt agt aac acg tgg cat 5545
Arg Ser Met Arg Ile Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His
1235 1240 1245 

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tec 5593 
Gly Thr Phe Pro lie Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser 
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg egg gtg get get gag gag 5641 
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu 
1265 1270 1275 1280 

tac gtg gag gtt acg egg gtg ggg gat ttc cac tac gtg acg ggc atg 5689 
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met 

1285 1290 1295 

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gee ccc gaa ttc 5737 
Thr Thr Asp Asn Val Lys Cys Pro Cys Gin Val Pro Ala Pro Glu Phe 

1300 1305 1310 

ttc aca gaa gtg gat ggg gtg egg ttg cac agg tac get cca gcg tgc 5785 
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys 
1315 1320 1325 

aaa ccc ctc cta cgg gag gag gtc aca ttc ctg gtc ggg ctc aat caa 5833
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gln
1330 1335 1340 

tac ctg gtt ggg tea cag etc cca tgc gag ccc gaa ccg gac gta gca 5881 
Tyr Leu Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp Val Ala 
1345 1350 1355 1360 

gtg etc act tec atg etc acc gac ccc tec cac att acg gcg gag acg 5929 
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His lie Thr Ala Glu Thr 

1365 1370 1375 

get aag cgt agg ctg gee agg gga tct ccc ccc tec ttg gee age tea 5977 
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg cct tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 

cgt cat gac tec ccg gac get gac etc ate gag gee aac etc ctg tgg 6073 
Arg His Asp Ser Pro Asp Ala Asp Leu lie Glu Ala Asn Leu Leu Trp 
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gaa aat aag 6121 
Arg Gin Glu Met Gly Gly Asn He Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 

gta gta att ttg gac tct ttc gag ccg etc caa gcg gag gag gat gag 6169 
Val Val He Leu Asp Ser Phe Glu Pro Leu Gin Ala Glu Glu Asp Glu 

1445 1450 1455

agg gaa gta tcc gtt ccg gcg gag atc ctg cgg agg tcc agg aaa ttc 6217
Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys Phe

1460 1465 1470 

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265 
Pro Arg Ala Met Pro lie Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu 
1475 1480 1485 

tta gag tec tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313 
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly 
1490 1495 1500 

tgt cca ctg ccg cct gec aag gee cct ccg ata cca cct cca egg agg 6361 
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro lie Pro Pro Pro Arg Arg 
1505 1510 1515 1520 

aag agg acg gtt gtc ctg tea gaa tct ace gtg tct tct gee ttg gcg 6409 
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala 

1525 1530 1535 

gag etc gee aca aag ace ttc ggc age tec gaa teg teg gec gtc gac 6457 
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp 

1540 1545 1550 

age ggc acg gca acg gee tct cct gac cag ccc tec gac gac ggc gac 6505 
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gin Pro Ser Asp Asp Gly Asp 
1555 1560 1565 

gcg gga tec gac gtt gag teg tac tec tec atg ccc ccc ctt gag ggg 6553 
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 
1570 1575 1580 

gag ccg ggg gat ccc gat etc age gac ggg cct tgg tct ace gta age 6601 
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Pro Trp Ser Thr Val Ser 
1585 1590 1595 1600 

gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca tgg 6649 
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 

1605 1610 1615 

aca ggc gec ctg ate acg cca tgc get gcg gag gaa ace aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
lie Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

get aca aca tct cgc age gca age ctg egg cag aag aag gtc ace ttt 6793 
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr Phe 
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680

atg aag gcg aag gcg tcc aca gtt aag gct aaa ctt cta tcc gtg gag 6889
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu

1685 1690 1695 

gaa gec tgt aag ctg acg ccc cca cat teg gee aga tct aaa ttt ggc 6937 
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc cgg aac cta tcc agc aag gcc gtt aac cac 6985
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

ate cgc tec gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
He Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro He 
1730 1735 1740 

gac acc acc atc atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081
Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu
1745 1750 1755 1760 

aag ggg ggc cgc aag cca gct cgc ctt atc gta ttc cca gat ttg ggg 7129
Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly

1765 1770 1775 

gtt cgt gtg tgc gag aaa atg gee ctt tac gat gtg gtc tec acc etc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 

1780 1785 1790 

cct cag gec gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805 

cag egg gtc gag ttc ctg gtg aat gec tgg aaa gcg aag aaa tgc cct 7273 
Gin Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro 
1810 1815 1820 

atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gec 7369 
Asn Asp He Arg Val Glu Glu Ser He Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gee aga cag gee ata agg teg etc aca gag egg ctt tac ate 7417 
Pro Glu Ala Arg Gin Ala He Arg Ser Leu Thr Glu Arg Leu Tyr He 

1860 1865 1870 

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc egg 7465 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gin Asn Cys Gly Tyr Arg Arg 
1875 1880 1885 

tgc cgc gcg age ggt gta ctg acg acc age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900

tgt tac ttg aag gcc gct gcg gcc tgt cga gct gcg aag ctc cag gac 7561
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser 

1925 1930 1935 

gcg ggg ace caa gag gac gag gcg age eta egg gcc ttc acg gag get 7657 
Ala Gly Thr Gin Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala 

1940 1945 1950 

atg act aga tac tct gcc ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu He Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980 

gat gca tct ggc aaa agg gtg tac tat etc ace cgt gac ccc ace acc 7801 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
1985 1990 1995 2000 

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015 

tec tgg eta ggc aac ate ate atg tat gcg ccc acc ttg tgg gca agg 7897 
Ser Trp Leu Gly Asn He He Met Tyr Ala Pro Thr Leu Trp Ala Arg 

2020 2025 2030 

atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa caa 7945 
Met He Leu Met Thr His Phe Phe Ser He Leu Leu Ala Gin Glu Gin 
2035 2040 2045 

ctt gaa aaa gcc eta gat tgt cag ate tac ggg gcc tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser He 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg Val Ala 

2085 2090 2095 

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gcc aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125

gcc act tgt ggc aag tac ctc ttc aac tgg gca gta agg acc aag ctc 8233
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu
2130 2135 2140 

aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age tgg 8281 
Lys Leu Thr Pro lie Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser Trp 
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp lie Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gcc cga ccc cgc tgg ttc atg tgg tgc eta etc eta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

2180 2185 2190 

gta ggc ate tat eta etc ccc aac cga tga aeggggaget aaacactcea 8427 
Val Gly lie Tyr Leu Leu Pro Asn Arg * 
2195 2200 

ggecaatagg ccatcctgtt tttttttttt tttttttttt tttttttttt tttttttttt 8487 

tttttttttt tttttttttt ttttttcctc ttttttttcc ttttctttcc tttggtggct 8547 

ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagcegcttg actgeagaga 8607 

gtgetgatae tggcctctct gcagatcaag t 8638 

<210> 7 
<211> 8638 
<212> DNA 
<213> HCV 

<220> 
<221> CDS 

<222> (1802) . . . (8407) 
<400> 7 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560 
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680 
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740 
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800 
c atg gac egg gag atg gca gca teg tgc gga ggc gcg gtt ttc gta ggt 1849 
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly 
1 5 10 15 

ctg ata ctc ttg acc ttg tca ccg cac tat aag ctg ttc ctc gct agg 1897
Leu Ile Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg

20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gee gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gee gtc 1993 
Gin Val Trp He Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 

ate etc etc acg tgc gcg ate cac cca gag eta ate ttt acc ate acc 2041 
He Leu Leu Thr Cys Ala He His Pro Glu Leu He Phe Thr lie Thr 
65 70 75 80 

aaa atc ttg ctc gcc ata ctc ggt cca ctc atg gtg ctc cag gct ggt 2089
Lys Ile Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
He Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu He Arg Ala 

100 105 110 

tgc atg ctg gtg egg aag gtt get ggg ggt cat tat gtc caa atg get 2185 
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gin Met Ala 
115 120 125 

etc atg aag ttg gee gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

acc cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate acc 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val He Thr 

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp He He Leu Gly Leu Pro 

180 185 190 

gtc tcc gcc cgc agg ggg agg gag ata cat ctg gga ccg gca gac agc 2425
Val Ser Ala Arg Arg Gly Arg Glu Ile His Leu Gly Pro Ala Asp Ser
195 200 205

ctt gaa ggg cag ggg tgg cga ctc ctc gcg cct att acg gcc tac tcc 2473
Leu Glu Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser
210 215 220 

caa cag acg cga ggc eta ctt ggc tgc ate ate ace age etc aca ggc 2521 
Gin Gin Thr Arg Gly Leu Leu Gly Cys He He Thr Ser Leu Thr Gly 
225 230 235 240 

egg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tec acc gca 2569 
Arg Asp Arg Asn Gin Val Glu Gly Glu Val Gin Val Val Ser Thr Ala 

245 250 255 

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617 
Thr Gin Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 

260 265 270 

tat cat ggt gec ggc tea aag acc ctt gee ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro He 
275 280 285 

acc caa atg tac acc aat gtg gac cag gac etc gtc ggc tgg caa gcg 2713 
Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Gin Ala 
290 295 300 

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg aag cat gee gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Lys His Ala Asp Val He Pro Val Arg Arg Arg 

325 330 335 

ggc gac age agg ggg age eta etc tec ccc egg ccc gtc tec tac ttg 2857 
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 

340 345 350 

aag ggc tct tcg ggc ggt cca ctg ctc tgc ccc tcg ggg cac gct gtg 2905
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
355 360 365 

ggc ate ttt egg get gee gtg tgc acc cga ggg gtt gcg aag gcg gtg 2953 
Gly He Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val 
370 375 380 

gac ttt gta ccc gtc gag tct atg gaa acc act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
385 390 395 400

ttc acg gac aac teg tec cct ccg gee gta ccg cag aca ttc cag gtg 3049 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gin Thr Phe Gin Val 

405 410 415 



gcc cat cta cac gcc cct act ggt agc ggc aag agc act aag gtg ccg 3097
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro

420 425 430

gct gcg tat gca gcc caa ggg tat aag gtg ctt gtc ctg aac ccg tcc 3145
Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
435 440 445 

gtc gec gec acc eta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
450 455 460 

ate gac cct aac ate aga acc ggg gta agg acc ate acc acg ggt gec 3241 
lie Asp Pro Asn lie Arg Thr Gly Val Arg Thr lie Thr Thr Gly Ala 
465 470 475 480 

ccc ate acg tac tec acc tat ggc aag ttt ctt gec gac ggt ggt tgc 3289 
Pro lie Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gee tat gac ate ata ata tgt gat gag tgc cac tea act 3337 
Ser Gly Gly Ala Tyr Asp lie lie lie Cys Asp Glu Cys His Ser Thr 

500 505 510

gac teg acc act ate ctg ggc ate ggc aca gtc ctg gac caa gcg gag 3385 
Asp Ser Thr Thr lie Leu Gly lie Gly Thr Val Leu Asp Gin Ala Glu 
515 520 525 

acg get gga gcg cga etc gtc gtg etc gec acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn lie Glu Glu Val Ala Leu Ser Ser 
545 550 555 560 

act gga gaa ate ccc ttt tat ggc aaa gee ate ccc ate gag acc ate 3529 
Thr Gly Glu lie Pro Phe Tyr Gly Lys Ala He Pro He Glu Thr He 

565 570 575 

aag ggg ggg agg cac etc att ttc tgc cat tec aag aag aaa tgc gat 3577 
Lys Gly Gly Arg His Leu He Phe Cys His Ser Lys Lys Lys Cys Asp 

580 585 590 

gag etc gee gcg aag ctg tec ggc etc gga etc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val He 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 

tea gtg ate gac tgc aat aca tgt gtc acc cag aca gtc gac ttc age 3769 
Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser 

645 650 655

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817
Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala

660 665 670 

gtg tea cgc teg cag egg cga ggc agg act ggt agg ggc agg atg ggc 3865 
Val Ser Arg Ser Gin Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly 
675 680 685 

att tac agg ttt gtg act cca gga gaa egg ccc teg ggc atg ttc gat 3913 
lie Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp 
690 695 700 

tec teg gtt ctg tgc gag tgc tat gac gcg ggc tgt get tgg tac gag 3961 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
705 710 715 720 

ctc acg ccc gcc gag acc tca gtt agg ttg cgg gct tac cta aac aca 4009
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr

725 730 735 

cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag ggc gtc 4057 
Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Gly Val 

740 745 750 

ttt aca ggc ctc acc cac ata gac gcc cat ttc ttg tcc cag act aag 4105
Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
755 760 765

cag gca gga gac aac ttc ccc tac ctg gta gca tac cag get acg gtg 4153 
Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val 
770 775 780 

tgc gcc agg gct cag gct cca cct cca tcg tgg gac caa atg tgg aag 4201
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
785 790 795 800 

tgt ctc ata cgg cta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
805 810 815

tat agg ctg gga gee gtt caa aac gag gtt act acc aca cac ccc ata 4297 
Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro He 

820 825 830 

acc aaa tac ate atg gca tgc atg teg get gac ctg gag gtc gtc acg 4345 
Thr Lys Tyr He Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr 
835 840 845 

age acc tgg gtg ctg gta ggc gga gtc eta gca get ctg get gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc agc gtg gtc att gtg ggc agg atc atc ttg tcc 4441
Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser
865 870 875 880

gga agg ccg gcc atc att ccc gac agg gaa gtc ctt tac cgg gag ttc 4489
Gly Arg Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe
885 890 895

gat gag atg gaa gag tgt gcc tca cac ctc cct tac atc gaa cag gga 4537
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly
900 905 910

atg cag etc gcc gaa caa ttc aaa cag aag gca ate ggg ttg ctg caa 4585 
Met Gin Leu Ala Glu Gin Phe Lys Gin Lys Ala He Gly Leu Leu Gin 
915 920 925 

aca gcc ace aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys 
930 935 940 

tgg egg acc etc gaa gcc ttc tgg gcg aag cat atg tgg aat ttc ate 4681 
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe He 
945 950 955 960 

age ggg ata caa tat tta gca ggc ttg tec act ctg cct ggc aac ccc 4729 
Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro 

965 970 975 

gcg ata gca tea ctg atg gca ttc aca gcc tct ate acc age ccg etc 4777 
Ala He Ala Ser Leu Met Ala Phe Thr Ala Ser He Thr Ser Pro Leu 

980 985 990 

acc acc caa cat acc etc ctg ttt aac ate ctg ggg gga tgg gtg gcc 4825 
Thr Thr Gin His Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val Ala 
995 1000 1005 

gcc caa ctt get cct ccc age get get tec get ttc gta ggc gcc ggc 4873 
Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly 
1010 1015 1020 

ate get gga gcg get gtt ggc age ata ggc ctt ggg aag gtg ctt gtg 4921 
He Ala Gly Ala Ala Val Gly Ser He Gly Leu Gly Lys Val Leu Val 
1025 1030 1035 1040 

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg etc gtg gcc 4969 
Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala 

1045 1050 1055 

ttt aag gtc atg age ggc gag atg ccc tec acc gag gac ctg gtt aac 5017 
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn 

1060 1065 1070 

eta etc cct get ate etc tec cct ggc gcc eta gtc gtc ggg gtc gtg 5065 
Leu Leu Pro Ala lie Leu Ser Pro Gly Ala Leu Val Val Gly Val Val 
1075 1080 1085 



tgc gca gcg ata ctg cgt cgg cac gtg ggc cca ggg gag ggg gct gtg 5113
Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val
1090 1095 1100

cag tgg atg aac cgg ctg ata gcg ttc gct tcg cgg ggt aac cac gtc 5161
Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val
1105 1110 1115 1120

tcc ccc acg cac tat gtg cct gag agc gac gct gca gca cgt gtc act 5209
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr
1125 1130 1135

cag ate etc tct agt ctt ace ate act cag ctg ctg aag agg ctt cac 5257 
Gin He Leu Ser Ser Leu Thr He Thr Gin Leu Leu Lys Arg Leu His 

1140 1145 1150 

cag tgg ate aac gag gac tgc tec acg cca tgc tec ggc teg tgg eta 5305 
Gin Trp He Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu 
1155 1160 1165 

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag gcc 5353
Arg Asp Val Trp Asp Trp Ile Cys Thr Val Leu Thr Asp Phe Lys Ala
1170 1175 1180

tgg ctc cag tcc aag ctc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401
Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe
1185 1190 1195 1200

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He Met 

1205 1210 1215 

caa acc acc tgc cca tgt gga gca cag atc acc gga cat gtg aaa aac 5497
Gln Thr Thr Cys Pro Cys Gly Ala Gln Ile Thr Gly His Val Lys Asn
1220 1225 1230

tgt tec atg agg ate gtg ggg cct agg acc tgt agt aac acg tgg cat 5545 
Cys Ser Met Arg He Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His 
1235 1240 1245 

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tec 5593 
Gly Thr Phe Pro He Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser 
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg cgg gtg gct gct gag gag 5641
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu
1265 1270 1275 1280

tac gtg gag gtt acg cga gtg ggg gat ttc cac tac gtg acg ggc atg 5689
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met
1285 1290 1295

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gcc ccc gaa ttc 5737
Thr Thr Asp Asn Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe
1300 1305 1310

ttc aca gaa gtg gat ggg gtg cgg ttg cac agg tac gct cca gcg tgc 5785
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys
1315 1320 1325

aaa ccc ctc cta cgg gag gag gtc aca ttc ctg gtc ggg ctc aat caa 5833
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gln
1330 1335 1340

tac ccg gtt ggg tea cag etc cca tgc gag ccc gaa ctg gac gta gca 5881 
Tyr Pro Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Leu Asp Val Ala 
1345 1350 1355 1360 

gtg etc act tec atg etc acc gac ccc tec cac att acg gcg gag acg 5929 
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His lie Thr Ala Glu Thr 

1365 1370 1375 

get aag cgt agg ctg gec agg gga tct ccc ccc tec ttg gee age tea 5977 
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg cct tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 

cgt cat gac tec ccg gac get gac etc ate gag gee aac etc ctg tgg 6073 
Arg His Asp Ser Pro Asp Ala Asp Leu lie Glu Ala Asn Leu Leu Trp 
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gag aat aag 6121 
Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 

gta gta att ttg gac tct ttc gag ccg etc caa gcg gag gag gat gag 6169 
Val Val lie Leu Asp Ser Phe Glu Pro Leu Gin Ala Glu Glu Asp Glu 

1445 1450 1455 

agg gaa gta tcc gtt ccg gcg gag atc ctg cgg agg tcc agg aaa ttc 6217
Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Arg Ser Arg Lys Phe
1460 1465 1470

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265 
Pro Arg Ala Met Pro He Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu 
1475 1480 1485 

tta gag tec tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313 
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly 
1490 1495 1500 

tgt cca ttg ccg cct gee aag gee cct ccg ata cca cct cca egg agg 6361 
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro He Pro Pro Pro Arg Arg 
1505 1510 1515 1520 

aag agg acg gtt gtc ctg tea gaa tct acc gtg tct tct gec ttg gcg 6409 
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala 

1525 1530 1535 

gag ctc gcc aca aag acc ttc ggc agc tcc gaa tcg tcg gcc gtc gac 6457
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp
1540 1545 1550

agc ggc acg gca acg gcc tct cct gac cag ccc tcc gac gac ggc gac 6505
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly Asp
1555 1560 1565

gcg gga tcc gac gtt gag tcg tac tcc tcc atg ccc ccc ctt gag ggg 6553
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 
1570 1575 1580 

gag ccg ggg gat ccc gat etc age gac ggg tct tgg tct ace gta age 6601 
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser 
1585 1590 1595 1600 

gag gag gct agt gag gac gtc gtc tgc tgc tcg atg tcc tac aca tgg 6649
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
1605 1610 1615

aca ggc gee ctg ate acg cca tgc get gcg gag gaa ace aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630 

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
He Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

get aca aca tct cgc age gca age ctg egg cag aag aag gtc acc ttt 6793 
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr Phe 
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tcc aca gtt aag gct aaa ctt cta tcc gtg gag 6889
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu
1685 1690 1695

gaa gcc tgt aag ctg acg ccc cca cat tcg gcc aga tct aaa ttt ggc 6937
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly
1700 1705 1710

tat ggg gca aag gac gtc cgg aac cta tcc agc aag gcc gtt aac cac 6985
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His
1715 1720 1725

atc cgc tcc gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033
Ile Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro Ile
1730 1735 1740

gac acc acc atc atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081
Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu
1745 1750 1755 1760

aag ggg ggc cgc aag cca gct cgc ctt atc gta ttc cca gat ttg ggg 7129
Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly
1765 1770 1775

gtt cgt gtg tgc gag aaa atg gcc ctt tac gat gtg gtc tcc acc ctc 7177
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu
1780 1785 1790

cct cag gcc gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805 

cag egg gtc gag ttc ctg gtg aat gcc tgg aaa gcg aag aaa tgc cct 7273 
Gin Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro 
1810 1815 1820 

atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gcc 7369 
Asn Asp lie Arg Val Glu Glu Ser lie Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gcc aga cag gcc ata agg teg etc aca gag egg ctt tac ate 7417 
Pro Glu Ala Arg Gin Ala lie Arg Ser Leu Thr Glu Arg Leu Tyr lie 

1860 1865 1870 

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc cgg 7465
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
1875 1880 1885

tgc cgc gcg agc ggt gta ctg acg acc agc tgc ggt aat acc ctc aca 7513
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr
1890 1895 1900

tgt tac ttg aag gcc get gcg gcc tgt cga get gcg aag etc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gin Asp 
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser 

1925 1930 1935 

gcg ggg acc caa gag gac gag gcg agc cta cgg gcc ttc acg gag gct 7657
Ala Gly Thr Gln Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala
1940 1945 1950

atg act aga tac tct gcc ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tca tgc tcc tcc aat gtg tca gtc gcg cac 7753
Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His
1970 1975 1980

gat gca tct ggc aaa agg gtg tac tat ctc acc cgt gac ccc acc acc 7801
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr
1985 1990 1995 2000

ccc ctt gcg cgg gct gcg tgg gag aca gct aga cac act cca gtc aat 7849
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn
2005 2010 2015

tec tgg eta ggc aac ate ate atg tat gcg ccc ace ttg tgg gca agg 7897 
Ser Trp Leu Gly Asn He He Met Tyr Ala Pro Thr Leu Trp Ala Arg 

2020 2025 2030 

atg ate ctg atg act cat ttc ttc tec ate ctt eta get cag gaa caa 7945 
Met He Leu Met Thr His Phe Phe Ser He Leu Leu Ala Gin Glu Gin 
2035 2040 2045 

ctt gaa aaa gee eta gat tgt cag ate tac ggg gee tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser He 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tca ctc cat agt tac tct cca ggt gag atc aat agg gtg gct 8089
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala
2085 2090 2095

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gec aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gcc act tgt ggc aag tac ctc ttc aac tgg gca gta agg acc aag ctc 8233
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu
2130 2135 2140

aaa ctc act cca atc ccg gct gcg tcc cag ttg gat tta tcc agc tgg 8281
Lys Leu Thr Pro Ile Pro Ala Ala Ser Gln Leu Asp Leu Ser Ser Trp
2145 2150 2155 2160

ttc gtt gct ggt tac agc ggg gga gac ata tat cac agc ctg tct cgt 8329
Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg
2165 2170 2175

gcc cga ccc cgc tgg ttc atg tgg tgc cta ctc cta ctt tct gta ggg 8377
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly
2180 2185 2190

gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427
Val Gly Ile Tyr Leu Leu Pro Asn Arg *
2195 2200

ggccaatagg ccatcctgtt tttttccctt tttttttttc tttttttttt tttttttttt 8487
tttttttttt ttttctcctt tttttttcct ctttttttcc ttttctttcc tttggtggct 8547
ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcttg actgcagaga 8607
gtgctgatac tggcctctct gcagatcaag t 8638

<210> 8
<211> 6
<212> DNA
<213> HCV

<400> 8

accagc 6



<210> 9 
<211> 63 
<212> DNA 
<213> HCV 

<400> 9 

gaattccaga tggcgcgccc agatgttaac cagatccatg gcacactcta gagtactgtc 60 
gac 63 

<210> 10
<211> 33
<212> DNA
<213> HCV

<400> 10

cggaatcgtt aacagaccac aacggtttcc ctc 33

<210> 11
<211> 30
<212> DNA
<213> HCV

<400> 11

ggcgtaccca tggtattatc gtgtttttca 30

<210> 12
<211> 45
<212> DNA
<213> HCV

<400> 12

gcatatgaat tctaatacga ctcactatag gccagccccc gattg 45

<210> 13
<211> 45
<212> DNA
<213> HCV

<400> 13

ggcgcgccct ttggtttttc tttgaggttt aggattcgtg ctcat 45

<210> 14
<211> 36
<212> DNA
<213> HCV

<400> 14

aaagggcgca tgattgaaca agatggattg cacgca 36



<210> 15
<211> 39
<212> DNA
<213> HCV

<400> 15

gcatatgtta actcagaaga actcgtcaag aaggcgata 39



<210> 16 
<211> 45 
<212> DNA 
<213> HCV 

<400> 16 

gcatatgaat tctaatacga ctcactatag gccagccccc gattg 45 

<210> 17
<211> 30
<212> DNA
<213> HCV

<400> 17

acgcagaaag cgtctagcca tggcgttagt 30

<210> 18
<211> 30
<212> DNA
<213> HCV

<400> 18

tcccggggca ctcgcaagca ccctatcagg 30

<210> 19
<211> 26
<212> DNA
<213> HCV

<220>
<223> Label with FAM: fluorescence reporter dye
<223> Label with TAMRA: Quencher dye

<400> 19

tggtctgcgg aacgggtgag tacacc 26

<210> 20
<211> 45
<212> DNA
<213> HCV

<400> 20

gtggacgaat tctaatacga ctcactataa ccagcccccg attgg 45

<210> 21
<211> 27
<212> DNA
<213> HCV

<400> 21

ggaacgcccg tcgtggccag ccacgat 27

<210> 22
<211> 23
<212> DNA
<213> HCV

<400> 22

gtcgtcttct ctgacatgga gac 23

<210> 23 
<211> 27 
<212> DNA 
<213> HCV 

<400> 23 

gagttgctca gtggattgat gggcagc 27 

<210> 24 
<211> 8638 
<212> DNA 
<213> HCV 



<220>
<221> CDS
<222> (1802)...(8407)

<400> 24

accagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

ctg ata etc ttg acc ttg tea ccg cac tat aag ctg ttc etc get agg 1897 
Leu lie Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg 

20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gec gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gee gtc 1993 
Gin Val Trp lie Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 

ate etc etc acg tgc gcg ate cac cca gag eta ate ttt acc ate acc 2041 
lie Leu Leu Thr Cys Ala lie His Pro Glu Leu lie Phe Thr lie Thr 
65 70 75 80 

aaa ate ttg etc gee ata etc ggt cca etc atg gtg etc cag get ggt 2089 
Lys lie Leu Leu Ala lie Leu Gly Pro Leu Met Val Leu Gin Ala Gly 

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
lie Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu lie Arg Ala 

100 105 110 

tgc atg ctg gtg egg aag gtt get ggg ggt cat tat gtc caa atg get 2185 
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gin Met Ala 
115 120 125 

etc atg aag ttg gec gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

acc cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate acc 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val lie Thr 

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp lie lie Leu Gly Leu Pro 

180 185 190 

gtc tec gee cgc agg ggg agg gag ata cat ctg gga ccg gca gac age 2425 
Val Ser Ala Arg Arg Gly Arg Glu lie His Leu Gly Pro Ala Asp Ser 
195 200 205 

ctt gaa ggg cag ggg tgg cga etc etc gcg cct att acg gee tac tec 2473 
Leu Glu Gly Gin Gly Trp Arg Leu Leu Ala Pro lie Thr Ala Tyr Ser 
210 215 220 

caa cag acg cga ggc cta ctt ggc tgc atc atc act agc ctc aca ggc 2521
Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly
225 230 235 240

cgg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tcc acc gca 2569
Arg Asp Arg Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
245 250 255

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617
Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
260 265 270

tat cat ggt gec ggc tea aag acc ctt gec ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro lie 
275 280 285 

acc caa atg tac acc aat gtg gac cag gac etc gtc ggc tgg caa gcg 2713 
Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Gin Ala 
290 295 300 

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg agg cat gec gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Arg His Ala Asp Val lie Pro Val Arg Arg Arg 

325 330 335 

ggc gac agc agg ggg agc cta ctc tcc ccc agg ccc gtc tcc tac ttg 2857
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu
340 345 350

aag ggc tct tcg ggc ggt cca ctg ctc tgc ccc tcg ggg cac gct gtg 2905
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
355 360 365

ggc atc ttt cgg gct gcc gtg tgc acc cga ggg gtt gcg aag gcg gtg 2953
Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
370 375 380 

gac ttt gta ccc gtc gag tct atg gaa acc act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
385 390 395 400 

ttc acg gac aac tcg tcc cct ccg gcc gta ccg cag aca ttc cag gtg 3049
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Thr Phe Gln Val
405 410 415

gee cat eta cac gee cct act ggt age ggc aag age act aag gtg ccg 3097 
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

420 425 430 

get gcg tat gca gec caa ggg tat aag gtg ctt gtc ctg aac ccg tec 3145 
Ala Ala Tyr Ala Ala Gin Gly Tyr Lys Val Leu Val Leu Asn Pro Ser 
435 440 445 

gtc gcc gcc acc cta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly
450 455 460

atc gac cct aac atc aga acc ggg gta agg acc atc acc acg ggt gcc 3241
Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
465 470 475 480

ccc ate acg tac tec acc tat ggc aag ttt ctt gec gac ggt ggt tgc 3289 
Pro lie Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gec tat gac ate ata ata tgt gat gag tgc cac tea act 3337 
Ser Gly Gly Ala Tyr Asp lie lie lie Cys Asp Glu Cys His Ser Thr 

500 505 510 

gac teg acc act ate ctg ggc ate ggc aca gtc ctg gac caa gcg gag 3385 
Asp Ser Thr Thr lie Leu Gly lie Gly Thr Val Leu Asp Gin Ala Glu 
515 520 525 

acg get gga gcg cga etc gtc gtg etc gee acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540 

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn lie Glu Glu Val Ala Leu Ser Ser 
545 550 555 560 

act gga gaa ate ccc ttt tat ggc aaa gee ate ccc ate gag acc ate 3529 
Thr Gly Glu lie Pro Phe Tyr Gly Lys Ala lie Pro lie Glu Thr He 

565 570 575 

aag ggg ggg agg cac ctc att ttc tgc cat tcc aag aag aaa tgt gat 3577
Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
580 585 590

gag ctc gee gcg aag ctg tec ggc ctc gga ctc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val He 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 

tea gtg ate gac tgc aat aca tgt gtc acc cag aca gtc gac ttc age 3769 
Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser 

645 650 655 

ctg gac ccg acc ttc acc att gag acg acg acc gtg cca caa gac gcg 3817 
Leu Asp Pro Thr Phe Thr He Glu Thr Thr Thr Val Pro Gin Asp Ala 

660 665 670 

gtg tca cgc tcg cag cgg cga ggc agg act ggt agg ggc agg atg ggc 3865
Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Met Gly
675 680 685

att tac agg ttt gtg act cca gga gaa cgg ccc tcg ggc atg ttc gat 3913
Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
690 695 700

tec teg gtt ctg tgc gag tgc tat gac gcg ggc tgt get tgg tac gag 3961 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
705 710 715 720 

etc acg ccc gee gag ace tea gtt agg ttg egg get tac eta aac aca 4009 
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 

725 730 735 

cca ggg ttg ccc gtc tgc cag gac cat ctg gag ttc tgg gag age gtc 4057 
Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Ser Val 

740 745 750 

ttt aca ggc etc acc cac ata gac gee cat ttc ttg tec cag act aag 4105 
Phe Thr Gly Leu Thr His lie Asp Ala His Phe Leu Ser Gin Thr Lys 
755 760 765 

cag gca gga gac aac ttc ccc tac ctg gta gca tac cag get acg gtg 4153 
Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val 
770 775 780 

tgc gcc agg gct cag gct cca cct cca tcg tgg gac caa atg tgg aag 4201
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
785 790 795 800 

tgt etc ata egg eta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249 
Cys Leu lie Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu 

805 810 815 

tat agg ctg gga gee gtt caa aac gag gtt act acc aca cac ccc ata 4297 
Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro lie 

820 825 830 

acc aaa tac ate atg gca tgc atg teg get gac ctg gag gtc gtc acg 4345 
Thr Lys Tyr lie Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr 
835 840 845 

age acc tgg gtg ctg gta ggc gga gtc eta gca get ctg gee gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc age gtg gtc att gtg ggc agg ate ate ttg tec 4441 
Cys Leu Thr Thr Gly Ser Val Val He Val Gly Arg He He Leu Ser 
865 870 875 880 

gga aag ccg gee ate att ccc gac agg gaa gtc ctt tac egg gag ttc 4489 
Gly Lys Pro Ala He He Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe 

885 890 895 

gat gag atg gaa gag tgc gcc tca cac ctc cct tac atc gaa cag gga 4537
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly
900 905 910

atg cag ctc gcc gaa caa ttc aaa cag aag gca atc ggg ttg ctg caa 4585
Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Ile Gly Leu Leu Gln
915 920 925

aca gec acc aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys 
930 935 940 

tgg egg acc etc gaa gee ttc tgg gcg aag cat atg tgg aat ttc ate 4681 
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe lie 
945 950 955 960 

agc ggg ata caa tat tta gca ggc ttg tcc act ctg cct ggc aac ccc 4729
Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro
965 970 975

gcg ata gca tea ctg atg gca ttc aca gee tct ate acc age ccg etc 4777 
Ala lie Ala Ser Leu Met Ala Phe Thr Ala Ser lie Thr Ser Pro Leu 

980 985 990 

acc acc caa cat acc etc ctg ttt aac ate ctg ggg gga tgg gtg gee 4825 
Thr Thr Gin His Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val Ala 
995 1000 1005 

gee caa ctt get cct ccc age get get tct get ttc gta ggc gee ggc 4873 
Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly 
1010 1015 1020 

ate get gga gcg get gtt ggc age ata ggc ctt ggg aag gtg ctt gtg 4921 
He Ala Gly Ala Ala Val Gly Ser He Gly Leu Gly Lys Val Leu Val 
1025 1030 1035 1040 

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg etc gtg gee 4969 
Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala 

1045 1050 1055 

ttt aag gtc atg age ggc gag atg ccc tec acc gag gac ctg gtt aac 5017 
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn 

1060 1065 1070 

eta etc cct get ate etc tec cct ggc gee eta gtc gtc ggg gtc gtg 5065 
Leu Leu Pro Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val Val 
1075 1080 1085 

tgc gca gcg ata ctg cgt egg cac gtg ggc cca ggg gag ggg get gtg 5113 
Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val 
1090 1095 1100 

cag tgg atg aac cgg ctg ata gcg ttc gct tcg cgg ggt aac cac gtc 5161
Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val
1105 1110 1115 1120

tcc ccc acg cac tat gtg cct gag agc gac gct gca gca cgt gtc act 5209
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr
1125 1130 1135

cag atc ctc tct agt ctt acc atc act cag ctg ctg aag agg ctt cac 5257
Gln Ile Leu Ser Ser Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His
1140 1145 1150

cag tgg ate aac gag gac tgc tec acg cca tgc tec ggc teg tgg eta 5305 
Gin Trp He Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu 
1155 1160 1165 

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag acc 5353 
Arg Asp Val Trp Asp Trp He Cys Thr Val Leu Thr Asp Phe Lys Thr 
1170 1175 1180 

tgg etc cag tec aag etc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401 
Trp Leu Gin Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe 
1185 1190 1195 1200 

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He Met 

1205 1210 1215 

caa acc acc tgc cca tgt gga gca cag ate acc gga cat gtg aaa aac 5497 
Gin Thr Thr Cys Pro Cys Gly Ala Gin He Thr Gly His Val Lys Asn 

1220 1225 1230 

ggt tec atg agg ate gtg ggg cct agg acc tgt agt aac acg tgg cat 5545 
Gly Ser Met Arg He Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His 
1235 1240 1245 

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tec 5593 
Gly Thr Phe Pro He Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser 
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg egg gtg get get gag gag 5641 
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu 
1265 1270 1275 1280 

tac gtg gag gtt acg egg gtg ggg gat ttc cac tac gtg acg ggc atg 5689 
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met 

1285 1290 1295 

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gcc ccc gaa ttc 5737
Thr Thr Asp Asn Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe
1300 1305 1310

ttc aca gaa gtg gat ggg gtg egg ttg cac agg tac get cca gcg tgc 5785 
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys 
1315 1320 1325 

aaa ccc ctc cta cgg gag gag gtc aca ttc ctg gtc ggg ctc aat caa 5833
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gln
1330 1335 1340 

tac ctg gtt ggg tea cag etc cca tgc gag ccc gaa ccg gac gta gca 5881 
Tyr Leu Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp Val Ala 
1345 1350 1355 1360

gtg ctc act tcc atg ctc acc gac ccc tcc cac att acg gcg gag acg 5929
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr

1365 1370 1375 

get aag cgt agg ctg gee agg gga tct ccc ccc tec ttg gee age tea 5977 
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg cct tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 

cgt cat gac tec ccg gac get gac etc ate gag gec aac etc ctg tgg 6073 
Arg His Asp Ser Pro Asp Ala Asp Leu lie Glu Ala Asn Leu Leu Trp 
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gaa aat aag 6121 
Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 

gta gta att ttg gac tct ttc gag ccg etc caa gcg gag gag gat gag 6169 
Val Val He Leu Asp Ser Phe Glu Pro Leu Gin Ala Glu Glu Asp Glu 

1445 1450 1455 

agg gaa gta tec gtt ccg gcg gag ate ctg egg agg tec agg aaa ttc 6217 
Arg Glu Val Ser Val Pro Ala Glu He Leu Arg Arg Ser Arg Lys Phe 

1460 1465 1470 

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265 
Pro Arg Ala Met Pro He Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu 
1475 1480 1485 

tta gag tec tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313 
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly 
1490 1495 1500 

tgt cca ttg ccg cct gec aag gee cct ccg ata cca cct cca egg agg 6361 
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro He Pro Pro Pro Arg Arg 
1505 1510 1515 1520 

aag agg acg gtt gtc ctg tea gaa tct acc gtg tct tct gee ttg gcg 6409 
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala 

1525 1530 1535 

gag etc gee aca aag acc ttc ggc age tec gaa teg teg gee gtc gac 6457 
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp 

1540 1545 1550 

age ggc acg gca acg gee tct cct gac cag ccc tec gac gac ggc gac 6505 
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gin Pro Ser Asp Asp Gly Asp 
1555 1560 1565 

gcg gga tec gac gtt gag teg tac tec tec atg ccc ccc ctt gag ggg 6553 
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 
1570 1575 1580

gag ccg ggg gat ccc gat ctc agc gac ggg tct tgg tct acc gta agc 6601
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
1585 1590 1595 1600 

gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca tgg 6649 
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 

1605 1610 1615 

aca ggc gee ctg ate acg cca tgc get gcg gag gaa acc aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630 

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
He Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

get aca aca tct cgc age gca age ctg egg cag aag aag gtc acc ttt 6793 
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gin Lys Lys Val Thr Phe 
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tcc aca gtt aag gct aaa ctt cta tcc gtg gag 6889
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu

1685 1690 1695 

gaa gee tgt aag ctg acg ccc cca cat teg gee aga tct aaa ttt ggc 6937 
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc egg aac eta tec age aag gee gtt aac cac 6985 
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

ate cgc tec gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
He Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro He 
1730 1735 1740 

gac acc acc ate atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081 
Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu 
1745 1750 1755 1760 

aag ggg ggc cgc aag cca get cgc ctt ate gta ttc cca gat ttg ggg 7129 
Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp Leu Gly 

1765 1770 1775 

gtt cgt gtg tgc gag aaa atg gee ctt tac gat gtg gtc tec acc etc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 

1780 1785 1790 

cct cag gee gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805

cag cgg gtc gag ttc ctg gtg aat gcc tgg aaa gcg aag aaa tgc cct 7273
Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro
1810 1815 1820 

atg ggc ttc gca tat gac acc cgc tgt ttt gac tea acg gtc act gag 7321 
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu 
1825 1830 1835 1840 

aat gac ate cgt gtt gag gag tea ate tac caa tgt tgt gac ttg gee 7369 
Asn Asp lie Arg Val Glu Glu Ser lie Tyr Gin Cys Cys Asp Leu Ala 

1845 1850 1855 

ccc gaa gcc aga cag gcc ata agg tcg ctc aca gag cgg ctt tac atc 7417
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

1860 1865 1870

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc cgg 7465
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
1875 1880 1885 

tgc cgc gcg age ggt gta ctg acg ace age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900 

tgt tac ttg aag gee get gcg gee tgt cga get gcg aag etc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gin Asp 
1905 1910 1915 1920 

tgc acg atg etc gta tgc gga gac gac ctt gtc gtt ate tgt gaa age 7609 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser 

1925 1930 1935 

gcg ggg acc caa gag gac gag gcg age eta egg gee ttc acg gag get 7657 
Ala Gly Thr Gin Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala 

1940 1945 1950 

atg act aga tac tct gee ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu He Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980 

gat gca tct ggc aaa agg gtg tac tat ctc acc cgt gac ccc acc acc 7801
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr
1985 1990 1995 2000

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015 



tcc tgg cta ggc aac atc atc atg tat gcg ccc acc ttg tgg gca agg 7897
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg
2020 2025 2030

atg atc ctg atg act cat ttc ttc tcc atc ctt cta gct cag gaa caa 7945
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln
2035 2040 2045 

ctt gaa aaa gec eta gat tgt cag ate tac ggg gee tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin lie Tyr Gly Ala Cys Tyr Ser lie 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin lie lie Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu lie Asn Arg Val Ala 

2085 2090 2095 

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gee aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gee act tgt ggc aag tac etc ttc aac tgg gca gta agg ace aag etc 8233 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 
2130 2135 2140 

aaa etc act cca ate ccg get gcg tec cag ttg gat tta tec age tgg 8281 
Lys Leu Thr Pro lie Pro Ala Ala Ser Gin Leu Asp Leu Ser Ser Trp 
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gee cga ccc cgc tgg ttc atg tgg tgc eta etc eta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

2180 2185 2190 

gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427
Val Gly Ile Tyr Leu Leu Pro Asn Arg *
2195 2200

ggccaatagg ccatcctgtt tttttccctt tttttttttc tttttttttt tttttttttt 8487

tttttttttt ttttctcctt tttttttcct ctttttttcc ttttctttcc tttggtggct 8547

ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcttg actgcagaga 8607

gtgctgatac tggcctctct gcagatcaag t 8638

<210> 25
<211> 8638
<212> DNA
<213> HCV

<220>
<221> CDS
<222> (1802)...(8407)

<400> 25

accagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agttcgcgcc cagatgttaa 1200 
cagaccacaa cggtttccct ctagcgggat caattccgcc ccccccccta acgttactgg 1260
ccgaagccgc ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt 1320 
gccgtctttt ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc 1380 
taggggtctt tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc 1440 
agttcctctg gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg 1500 
gaacccccca cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc 1560
tgcaaaggcg gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa 1620 
atggctctcc tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg 1680 
tatgggatct gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa 1740
aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac acgataatac 1800
c atg gac cgg gag atg gca gca tcg tgc gga ggc gcg gtt ttc gta ggt 1849
Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly
1 5 10 15

ctg ata etc ttg acc ttg tea ccg cac tat aag ctg ttc etc get agg 1897 
Leu lie Leu Leu Thr Leu Ser Pro His Tyr Lys Leu Phe Leu Ala Arg 

20 25 30 

etc ata tgg tgg tta caa tat ttt ate acc agg gee gag gca cac ttg 1945 
Leu lie Trp Trp Leu Gin Tyr Phe lie Thr Arg Ala Glu Ala His Leu 
35 40 45 

caa gtg tgg ate ccc ccc etc aac gtt egg ggg ggc cgc gat gee gtc 1993 
Gin Val Trp lie Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val 
50 55 60 

ate etc etc acg tgc gcg ate cac cca gag eta ate ttt acc ate acc 2041 
lie Leu Leu Thr Cys Ala lie His Pro Glu Leu lie Phe Thr lie Thr 
65 70 75 80 

aaa ate ttg etc gee ata etc ggt cca etc atg gtg etc cag get ggt 2089 
Lys lie Leu Leu Ala He Leu Gly Pro Leu Met Val Leu Gin Ala Gly 

85 90 95 

ata acc aaa gtg ccg tac ttc gtg cgc gca cac ggg etc att cgt gca 2137 
He Thr Lys Val Pro Tyr Phe Val Arg Ala His Gly Leu He Arg Ala 

100 105 110

tgc atg ctg gtg cgg aag gtt gct ggg ggt cat tat gtc caa atg gct 2185
Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met Ala
115 120 125 

etc atg aag ttg gee gca ctg aca ggt acg tac gtt tat gac cat etc 2233 
Leu Met Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu 
130 135 140 

ace cca ctg egg gac tgg gee cac gcg ggc eta cga gac ctt gcg gtg 2281 
Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val 
145 150 155 160 

gca gtt gag ccc gtc gtc ttc tct gat atg gag acc aag gtt ate ace 2329 
Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Val lie Thr 

165 170 175 

tgg ggg gca gac acc gcg gcg tgt ggg gac ate ate ttg ggc ctg ccc 2377 
Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp lie He Leu Gly Leu Pro 

180 185 190 

gtc tec gec cgc agg ggg agg gag ata cat ctg gga ccg gca gac age 2425 
Val Ser Ala Arg Arg Gly Arg Glu He His Leu Gly Pro Ala Asp Ser 
195 200 205 

ctt gaa ggg cag ggg tgg cga etc etc gcg cct att acg gee tac tec 2473 
Leu Glu Gly Gin Gly Trp Arg Leu Leu Ala Pro He Thr Ala Tyr Ser 
210 215 220 

caa cag acg cga ggc eta ctt ggc tgc ate ate acc age etc aca ggc 2521 
Gin Gin Thr Arg Gly Leu Leu Gly Cys He He Thr Ser Leu Thr Gly 
225 230 235 240 

egg gac agg aac cag gtc gag ggg gag gtc caa gtg gtc tec acc gca 2569 
Arg Asp Arg Asn Gin Val Glu Gly Glu Val Gin Val Val Ser Thr Ala 

245 250 255 

aca caa tct ttc ctg gcg acc tgc gtc aat ggc gtg tgt tgg act gtc 2617 
Thr Gin Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 

260 265 270 

tat cat ggt gec ggc tea aag acc ctt gec ggc cca aag ggc cca ate 2665 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro He 
275 280 285 

acc caa atg tac acc aat gtg gac cag gac etc gtc ggc tgg caa gcg 2713 
Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Gin Ala 
290 295 300 

ccc ccc ggg gcg cgt tec ttg aca cca tgc acc tgc ggc age teg gac 2761 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
305 310 315 320 

ctt tac ttg gtc acg aag cat gec gat gtc att ccg gtg cgc egg egg 2809 
Leu Tyr Leu Val Thr Lys His Ala Asp Val He Pro Val Arg Arg Arg 

325 330 335

ggc gac agc agg ggg agc cta ctc tcc ccc cgg ccc gtc tcc tac ttg 2857
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 

340 345 350 

aag ggc tct teg ggc ggt cca ctg etc tgc ccc teg ggg cac get gtg 2905 
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
355 360 365 

ggc atc ttt cgg gct gcc gtg tgc acc cga ggg gtt gcg aag gcg gtg 2953
Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
370 375 380 

gac ttt gta ccc gtc gag tct atg gaa ace act atg egg tec ccg gtc 3001 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
385 390 395 400 

ttc acg gac aac teg tec cct ccg gee gta ccg cag aca ttc cag gtg 3049 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gin Thr Phe Gin Val 

405 410 415 

gcc cat cta cac gcc cct act ggt agc ggc aag agc act aag gtg ccg 3097
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

420 425 430 

get gcg tat gca gee caa ggg tat aag gtg ctt gtc ctg aac ccg tec 3145 
Ala Ala Tyr Ala Ala Gin Gly Tyr Lys Val Leu Val Leu Asn Pro Ser 
435 440 445 

gtc gec gee acc eta ggt ttc ggg gcg tat atg tct aag gca cat ggt 3193 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
450 455 460 

ate gac cct aac ate aga acc ggg gta agg acc ate acc acg ggt gee 3241 
He Asp Pro Asn He Arg Thr Gly Val Arg Thr He Thr Thr Gly Ala 
465 470 475 480 

ccc ate acg tac tec acc tat ggc aag ttt ctt gee gac ggt ggt tgc 3289 
Pro He Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 

485 490 495 

tct ggg ggc gee tat gac ate ata ata tgt gat gag tgc cac tea act 3337 
Ser Gly Gly Ala Tyr Asp He He He Cys Asp Glu Cys His Ser Thr 

500 505 510 

gac teg acc act ate ctg ggc ate ggc aca gtc ctg gac caa gcg gag 3385 
Asp Ser Thr Thr He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu 
515 520 525 

acg get gga gcg cga etc gtc gtg etc gee acc get acg cct ccg gga 3433 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
530 535 540 

teg gtc acc gtg cca cat cca aac ate gag gag gtg get ctg tec age 3481 
Ser Val Thr Val Pro His Pro Asn He Glu Glu Val Ala Leu Ser Ser 
545 550 555 560

act gga gaa atc ccc ttt tat ggc aaa gcc atc ccc atc gag acc atc 3529
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile

565 570 575 

aag ggg ggg agg cac etc att ttc tgc cat tec aag aag aaa tgc gat 3577 
Lys Gly Gly Arg His Leu lie Phe Cys His Ser Lys Lys Lys Cys Asp 

580 585 590 

gag etc gee gcg aag ctg tec ggc etc gga etc aat get gta gca tat 3625 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Leu Asn Ala Val Ala Tyr 
595 600 605 

tac egg ggc ctt gat gta tec gtc ata cca act age gga gac gtc att 3673 
Tyr Arg Gly Leu Asp Val Ser Val lie Pro Thr Ser Gly Asp Val lie 
610 615 620 

gtc gta gca acg gac get eta atg acg ggc ttt acc ggc gat ttc gac 3721 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp 
625 630 635 640 



645 650 



660 665 



675 680 



690 695 



705 710 715 



725 730 



740 745 



755 760 



770 775 



785 790 795

tgt ctc ata cgg cta aag cct acg ctg cac ggg cca acg ccc ctg ctg 4249
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu

805 810 815 

tat agg ctg gga gee gtt caa aac gag gtt act ace aca cac ccc ata 4297 
Tyr Arg Leu Gly Ala Val Gin Asn Glu Val Thr Thr Thr His Pro lie 

820 825 830 

acc aaa tac atc atg gca tgc atg tcg gct gac ctg gag gtc gtc acg 4345
Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr
835 840 845 

age acc tgg gtg ctg gta ggc gga gtc eta gca get ctg get gcg tat 4393 
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
850 855 860 

tgc ctg aca aca ggc age gtg gtc att gtg ggc agg ate ate ttg tec 4441 
Cys Leu Thr Thr Gly Ser Val Val lie Val Gly Arg lie He Leu Ser 
865 870 875 880 

gga agg ccg gcc atc att ccc gac agg gaa gtc ctt tac cgg gag ttc 4489
Gly Arg Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe

885 890 895 

gat gag atg gaa gag tgt gcc tca cac ctc cct tac atc gaa cag gga 4537
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly

900 905 910 

atg cag etc gec gaa caa ttc aaa cag aag gca ate ggg ttg ctg caa 4585 
Met Gin Leu Ala Glu Gin Phe Lys Gin Lys Ala He Gly Leu Leu Gin 
915 920 925 

aca gee acc aag caa gcg gag get get get ccc gtg gtg gaa tec aag 4633 
Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys 
930 935 940 

tgg cgg acc ctc gaa gcc ttc tgg gcg aag cat atg tgg aat ttc atc 4681
Trp Arg Thr Leu Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe Ile
945 950 955 960 

age ggg ata caa tat tta gca ggc ttg tec act ctg cct ggc aac ccc 4729 
Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro 

965 970 975 

gcg ata gca tea ctg atg gca ttc aca gee tct ate acc age ccg etc 4777 
Ala He Ala Ser Leu Met Ala Phe Thr Ala Ser He Thr Ser Pro Leu 

980 985 990 

acc acc caa cat acc ctc ctg ttt aac atc ctg ggg gga tgg gtg gcc 4825
Thr Thr Gln His Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala
995 1000 1005

gee caa ctt get cct ccc age get get tec get ttc gta ggc gee ggc 4873 
Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly 
1010 1015 1020

atc gct gga gcg gct gtt ggc agc ata ggc ctt ggg aag gtg ctt gtg 4921
Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val
1025 1030 1035 1040 

gat att ttg gca ggt tat gga gca ggg gtg gca ggc gcg etc gtg gec 4969 
Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala 

1045 1050 1055 

ttt aag gtc atg age ggc gag atg ccc tec acc gag gac ctg gtt aac 5017 
Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn 

1060 1065 1070 

eta etc cct get ate etc tec cct ggc gec eta gtc gtc ggg gtc gtg 5065 
Leu Leu Pro Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val Val 
1075 1080 1085 

tgc gca gcg ata ctg cgt egg cac gtg ggc cca ggg gag ggg get gtg 5113 
Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val 
1090 1095 1100 

cag tgg atg aac egg ctg ata gcg ttc get teg egg ggt aac cac gtc 5161 
Gin Trp Met Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His Val 
1105 1110 1115 1120 

tec ccc acg cac tat gtg cct gag age gac get gca gca cgt gtc act 5209 
Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr 

1125 1130 1135 

cag ate etc tct agt ctt acc ate act cag ctg ctg aag agg ctt cac 5257 
Gin He Leu Ser Ser Leu Thr He Thr Gin Leu Leu Lys Arg Leu His 

1140 1145 1150 

cag tgg atc aac gag gac tgc tcc acg cca tgc tcc ggc tcg tgg cta 5305
Gln Trp Ile Asn Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu
1155 1160 1165 

aga gat gtt tgg gat tgg ata tgc acg gtg ttg act gat ttc aag gec 5353 
Arg Asp Val Trp Asp Trp He Cys Thr Val Leu Thr Asp Phe Lys Ala 
1170 1175 1180 

tgg ctc cag tcc aag ctc ctg ccg cga ttg ccg gga gtc ccc ttc ttc 5401
Trp Leu Gln Ser Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe
1185 1190 1195 1200 

tea tgt caa cgt ggg tac aag gga gtc tgg egg ggc gac ggc ate atg 5449 
Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He Met 

1205 1210 1215 

caa acc acc tgc cca tgt gga gca cag ate acc gga cat gtg aaa aac 5497 
Gin Thr Thr Cys Pro Cys Gly Ala Gin He Thr Gly His Val Lys Asn 

1220 1225 1230 

tgt tec atg agg ate gtg ggg cct agg acc tgt agt aac acg tgg cat 5545 
Cys Ser Met Arg He Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His 
1235 1240 1245

gga aca ttc ccc att aac gcg tac acc acg ggc ccc tgc acg ccc tcc 5593
Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser
1250 1255 1260 

ccg gcg cca aat tat tct agg gcg ctg tgg egg gtg get get gag gag 5641 
Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu 
1265 1270 1275 1280 

tac gtg gag gtt acg cga gtg ggg gat ttc cac tac gtg acg ggc atg 5689 
Tyr Val Glu Val Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met 

1285 1290 1295 

acc act gac aac gta aag tgc ccg tgt cag gtt ccg gee ccc gaa ttc 5737 
Thr Thr Asp Asn Val Lys Cys Pro Cys Gin Val Pro Ala Pro Glu Phe 

1300 1305 1310 

ttc aca gaa gtg gat ggg gtg egg ttg cac agg tac get cca gcg tgc 5785 
Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys 
1315 1320 1325 

aaa ccc etc eta egg gag gag gtc aca ttc ctg gtc ggg etc aat caa 5833 
Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Leu Val Gly Leu Asn Gin 
1330 1335 1340 

tac ccg gtt ggg tea cag etc cca tgc gag ccc gaa ctg gac gta gca 5881 
Tyr Pro Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Leu Asp Val Ala 
1345 1350 1355 1360 

gtg ctc act tcc atg ctc acc gac ccc tcc cac att acg gcg gag acg 5929
Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr

1365 1370 1375 

gct aag cgt agg ctg gcc agg gga tct ccc ccc tcc ttg gcc agc tca 5977
Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser 

1380 1385 1390 

tea get age cag ctg tct gcg cct tec ttg aag gca aca tgc act acc 6025 
Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Thr 
1395 1400 1405 

cgt cat gac tec ccg gac get gac etc ate gag gee aac etc ctg tgg 6073 
Arg His Asp Ser Pro Asp Ala Asp Leu lie Glu Ala Asn Leu Leu Trp 
1410 1415 1420 

egg cag gag atg ggc ggg aac ate acc cgc gtg gag tea gag aat aag 6121 
Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn Lys 
1425 1430 1435 1440 

gta gta att ttg gac tct ttc gag ccg ctc caa gcg gag gag gat gag 6169
Val Val Ile Leu Asp Ser Phe Glu Pro Leu Gln Ala Glu Glu Asp Glu

1445 1450 1455

agg gaa gta tec gtt ccg gcg gag ate ctg egg agg tec agg aaa ttc 6217 
Arg Glu Val Ser Val Pro Ala Glu He Leu Arg Arg Ser Arg Lys Phe 

1460 1465 1470

cct cga gcg atg ccc ata tgg gca cgc ccg gat tac aac cct cca ctg 6265
Pro Arg Ala Met Pro Ile Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
1475 1480 1485

tta gag tec tgg aag gac ccg gac tac gtc cct cca gtg gta cac ggg 6313 
Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val Pro Pro Val Val His Gly 
1490 1495 1500 

tgt cca ttg ccg cct gec aag gec cct ccg ata cca cct cca egg agg 6361 
Cys Pro Leu Pro Pro Ala Lys Ala Pro Pro lie Pro Pro Pro Arg Arg 
1505 1510 1515 1520 

aag agg acg gtt gtc ctg tea gaa tct acc gtg tct tct gec ttg gcg 6409 
Lys Arg Thr Val Val Leu Ser Glu Ser Thr Val Ser Ser Ala Leu Ala 

1525 1530 1535 

gag etc gec aca aag acc ttc ggc age tec gaa teg teg gee gtc gac 6457 
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Glu Ser Ser Ala Val Asp 

1540 1545 1550 

agc ggc acg gca acg gcc tct cct gac cag ccc tcc gac gac ggc gac 6505
Ser Gly Thr Ala Thr Ala Ser Pro Asp Gln Pro Ser Asp Asp Gly Asp
1555 1560 1565 

gcg gga tec gac gtt gag teg tac tec tec atg ccc ccc ctt gag ggg 6553 
Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 
1570 1575 1580 

gag ccg ggg gat ccc gat etc age gac ggg tct tgg tct acc gta age 6601 
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser 
1585 1590 1595 1600 

gag gag get agt gag gac gtc gtc tgc tgc teg atg tec tac aca tgg 6649 
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 

1605 1610 1615 

aca ggc gee ctg ate acg cca tgc get gcg gag gaa acc aag ctg ccc 6697 
Thr Gly Ala Leu lie Thr Pro Cys Ala Ala Glu Glu Thr Lys Leu Pro 

1620 1625 1630 

ate aat gca ctg age aac tct ttg etc cgt cac cac aac ttg gtc tat 6745 
lie Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr 
1635 1640 1645 

gct aca aca tct cgc agc gca agc ctg cgg cag aag aag gtc acc ttt 6793
Ala Thr Thr Ser Arg Ser Ala Ser Leu Arg Gln Lys Lys Val Thr Phe
1650 1655 1660 

gac aga ctg cag gtc ctg gac gac cac tac egg gac gtg etc aag gag 6841 
Asp Arg Leu Gin Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Glu 
1665 1670 1675 1680 

atg aag gcg aag gcg tec aca gtt aag get aaa ctt eta tec gtg gag 6889 
Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu 

1685 1690 1695

gaa gcc tgt aag ctg acg ccc cca cat tcg gcc aga tct aaa ttt ggc 6937
Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Gly 

1700 1705 1710 

tat ggg gca aag gac gtc egg aac eta tec age aag gcc gtt aac cac 6985 
Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Val Asn His 
1715 1720 1725 

ate cgc tec gtg tgg aag gac ttg ctg gaa gac act gag aca cca att 7033 
lie Arg Ser Val Trp Lys Asp Leu Leu Glu Asp Thr Glu Thr Pro He 
1730 1735 1740 

gac acc acc ate atg gca aaa aat gag gtt ttc tgc gtc caa cca gag 7081 
Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu 
1745 1750 1755 1760 

aag ggg ggc cgc aag cca get cgc ctt ate gta ttc cca gat ttg ggg 7129 
Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp Leu Gly 

1765 1770 1775 

gtt cgt gtg tgc gag aaa atg gcc ctt tac gat gtg gtc tec acc etc 7177 
Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu 

1780 1785 1790 

cct cag gcc gtg atg ggc tct tea tac gga ttc caa tac tct cct gga 7225 
Pro Gin Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly 
1795 1800 1805 

cag cgg gtc gag ttc ctg gtg aat gcc tgg aaa gcg aag aaa tgc cct 7273
Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ala Lys Lys Cys Pro
1810 1815 1820

atg ggc ttc gca tat gac acc cgc tgt ttt gac tca acg gtc act gag 7321
Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1825 1830 1835 1840 

aat gac atc cgt gtt gag gag tca atc tac caa tgt tgt gac ttg gcc 7369
Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

1845 1850 1855 

ccc gaa gcc aga cag gcc ata agg teg etc aca gag egg ctt tac ate 7417 
Pro Glu Ala Arg Gin Ala He Arg Ser Leu Thr Glu Arg Leu Tyr He 

1860 1865 1870 

ggg ggc ccc ctg act aat tct aaa ggg cag aac tgc ggc tat cgc cgg 7465
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
1875 1880 1885

tgc cgc gcg age ggt gta ctg acg acc age tgc ggt aat acc etc aca 7513 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
1890 1895 1900 

tgt tac ttg aag gcc get gcg gcc tgt cga get gcg aag etc cag gac 7561 
Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Ala Ala Lys Leu Gin Asp 
1905 1910 1915 1920

tgc acg atg ctc gta tgc gga gac gac ctt gtc gtt atc tgt gaa agc 7609
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

1925 1930 1935

gcg ggg acc caa gag gac gag gcg age eta egg gec ttc acg gag get 7657 
Ala Gly Thr Gin Glu Asp Glu Ala Ser Leu Arg Ala Phe Thr Glu Ala 

1940 1945 1950 

atg act aga tac tct gec ccc cct ggg gac ccg ccc aaa cca gaa tac 7705 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Lys Pro Glu Tyr 
1955 1960 1965 

gac ttg gag ttg ata aca tea tgc tec tec aat gtg tea gtc gcg cac 7753 
Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
1970 1975 1980 

gat gca tct ggc aaa agg gtg tac tat etc acc cgt gac ccc acc acc 7801 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
1985 1990 1995 2000 

ccc ctt gcg egg get gcg tgg gag aca get aga cac act cca gtc aat 7849 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

2005 2010 2015 

tcc tgg cta ggc aac atc atc atg tat gcg ccc acc ttg tgg gca agg 7897
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg

2020 2025 2030

atg atc ctg atg act cat ttc ttc tcc atc ctt cta gct cag gaa caa 7945
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln
2035 2040 2045 

ctt gaa aaa gec eta gat tgt cag ate tac ggg gee tgt tac tec att 7993 
Leu Glu Lys Ala Leu Asp Cys Gin He Tyr Gly Ala Cys Tyr Ser He 
2050 2055 2060 

gag cca ctt gac eta cct cag ate att caa cga etc cac ggc ctt age 8041 
Glu Pro Leu Asp Leu Pro Gin He He Gin Arg Leu His Gly Leu Ser 
2065 2070 2075 2080 

gca ttt tea etc cat agt tac tct cca ggt gag ate aat agg gtg get 8089 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg Val Ala 

2085 2090 2095 

tea tgc etc agg aaa ctt ggg gta ccg ccc ttg cga gtc tgg aga cat 8137 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

2100 2105 2110 

egg gec aga agt gtc cgc get agg eta ctg tec cag ggg ggg agg get 8185 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser Gin Gly Gly Arg Ala 
2115 2120 2125 

gee act tgt ggc aag tac etc ttc aac tgg gca gta agg acc aag etc 8233 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 
2130 2135 2140

aaa ctc act cca atc ccg gct gcg tcc cag ttg gat tta tcc agc tgg 8281
Lys Leu Thr Pro Ile Pro Ala Ala Ser Gln Leu Asp Leu Ser Ser Trp
2145 2150 2155 2160 

ttc gtt get ggt tac age ggg gga gac ata tat cac age ctg tct cgt 8329 
Phe Val Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Leu Ser Arg 

2165 2170 2175 

gee cga ccc cgc tgg ttc atg tgg tgc eta etc eta ctt tct gta ggg 8377 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

2180 2185 2190 



gta ggc atc tat cta ctc ccc aac cga tga acggggagct aaacactcca 8427
Val Gly Ile Tyr Leu Leu Pro Asn Arg *
2195 2200

ggccaatagg ccatcctgtt tttttccctt tttttttttc tttttttttt tttttttttt 8487

tttttttttt ttttctcctt tttttttcct ctttttttcc ttttctttcc tttggtggct 8547

ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcttg actgcagaga 8607

gtgctgatac tggcctctct gcagatcaag t 8638

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows:

1. Claims: 1-22 

A hepatitis C virus (HCV) replicon comprising: a 5'-non-translated
region (NTR) wherein the guanine at position 1 is replaced by
adenine, an HCV polyprotein region coding for an HCV
polyprotein further comprising one or more amino acid
substitutions (adaptive mutations) in a non-structural
protein, and a 3'-NTR; a eukaryotic host cell transfected
with said replicon; an RNA replication assay making use of
said host cell; and a method for testing compounds that
inhibit HCV replication using said host cell.



2. Claims: 23-42 (in part) 

A hepatitis C virus (HCV) replicon comprising: a 5'-non-translated
region (NTR), an HCV polyprotein region coding for an HCV
polyprotein comprising a R(1135)K amino acid substitution
(adaptive mutation), and a 3'-NTR; a eukaryotic host cell
transfected with said replicon; an RNA replication assay
making use of said host cell; and a method for testing
compounds that inhibit HCV replication using said host cell.



3. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
S(1148)G substitution.



4. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
S(1560)G substitution. 



5. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
K(1691)R substitution. 



6. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
L(1701)F substitution. 



7. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
I(1984)V substitution.

8. Claims: 23-42 (in part)

Idem as subject-matter 2 wherein the adaptive mutation is a 
T(1993)A substitution. 

9. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
G(2042)C or a G(2042)R substitution. 

10. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
S(2404)P substitution. 

11. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
L(2155)P substitution. 

12. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
P(2166)L substitution. 

13. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
M(2992)T substitution. 

14. Claims: 23-42 (in part) 

Idem as subject-matter 2 wherein the adaptive mutation is a 
E(1202)G substitution.